3H5 is the first in a series of sessions in which we walk through Apache Hop's concepts, architecture and functionality. Check the schedule:
http://hop.apache.org/community/events/
We cover Hop's concepts, tools and installation before we move on to building pipelines. As a 3Hx session, this is as interactive as possible, with plenty of time for Q&A.
Download Hop:
http://hop.apache.org/download/
Slides:
https://s.apache.org/f98hd
Some of the external plugins discussed:
CPython plugin:
https://github.com/m-a-hall/hop-cpython
Hop Machine Intelligence (Weka, Python scikit-learn, R MLR, Spark MLlib, DL4j):
https://github.com/m-a-hall/hop-mi
Hop website:
https://hop.apache.org
Join our chat:
https://chat.project-hop.org
Follow Hop on Twitter:
https://twitter.com/ApacheHop
Follow Hop on LinkedIn:
https://www.linkedin.com/company/ApacheHop
00:00 Hop overview: history, concepts, download & install
25:05 Demo: Hop GUI, creating pipelines and workflows
35:50 Q&A 1: Python and R integration
38:15 Q&A 2: Web service integration
43:00 Q&A 3: Hive, Hadoop integration
50:10 Q&A 4: logging & monitoring
55:15 Q&A 5: Avro, Parquet, ORC
59:35 Q&A 6: bug & feature tickets (JIRA)
1:05:25 Q&A 7: pushdown SQL to Snowflake etc.
1:09:55 Q&A 8: upload changes to Hop Server
1:11:45 Q&A 9: Hop build pipeline, CI/CD
1:15:10 Q&A 10: R integration options
Tags:
#apache_hop #apachehop #data_engineering #data_orchestration #pdi #kettle