Apache Bahir provides streaming connectors and SQL data sources for Apache Spark and Apache Flink in one centralized location. It contains connectors for ActiveMQ, Akka, Flume, InfluxDB, Kudu, Netty, Redis, CouchDB, Cloudant, MQTT, and Twitter. Bahir matters because it lets teams reuse existing extensions instead of recreating connectors, which saves time and money. Though small, it covers multiple Spark and Flink extensions and leaves room for future ones. The GitHub repositories were last updated in January 2020 (bahir) and May 2020 (bahir-flink), and the connector documentation is comprehensive.
Apache Bahir
1. What Is Apache Bahir?
● Provides extensions for Apache Spark and Apache Flink
● Open source / Apache 2.0 license
● Streaming connectors and SQL data sources
● One grouped location for extensions
● Initiated in 2016 from the Apache Spark project
● A source for current and future extensions
3. Apache Bahir Spark Extensions
● SQL Data Sources
– Apache CouchDB/Cloudant data source
● Structured Streaming Data Sources
– Akka data source
– MQTT data source (new Sink)
5. Apache Bahir Importance
● Seems like a small project? But it covers
– Multiple Spark extensions
– Multiple Flink extensions
– Possible future extensions
● Why is it important?
– Knowledge of this project …
– Aids reuse and avoids the need to recreate connectors
– Saves money and time!
6. Apache Bahir Status
● OK, great project, but is it current?
● Started in 2016, but is it still going?
● Check GitHub
● https://github.com/apache/bahir-flink
– Last update 27 May 2020 => current
● https://github.com/apache/bahir
– Last update 20 January 2020 => current
7. Apache Bahir Documentation
● Flink connector documentation describes
– Dependencies
– Version compatibility
– Source and sink classes
– Linking for cluster execution
8. Apache Bahir Documentation
● Spark connector documentation describes
– Linking
– Configuration
– Examples in Scala, Java, and Python
● Taking MQTT as an example
● Documentation is comprehensive
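To make the MQTT example concrete, here is a minimal Python sketch of reading an MQTT topic as a Structured Streaming source. It follows the pattern shown in the Bahir MQTT connector documentation, but treat it as an illustration under assumptions, not a definitive setup: the helper names mqtt_source_config and read_mqtt_stream, the broker URL, and the topic are hypothetical, and the Bahir MQTT package (spark-sql-streaming-mqtt, in the build matching your Spark and Scala versions) is assumed to be on the classpath already.

```python
# Sketch only. Assumes a SparkSession that was started with the Bahir
# spark-sql-streaming-mqtt package available (for example via --packages).
# Helper names, broker URL, and topic are hypothetical placeholders.

# Source provider class name as given in the Bahir MQTT connector docs.
MQTT_PROVIDER = "org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider"


def mqtt_source_config(broker_url, topic):
    """Collect the settings the MQTT source needs into one dictionary."""
    if not topic:
        raise ValueError("an MQTT topic is required")
    return {"format": MQTT_PROVIDER, "topic": topic, "path": broker_url}


def read_mqtt_stream(spark, broker_url, topic):
    """Return a streaming DataFrame of MQTT messages.

    `spark` is an existing SparkSession; nothing is read until a
    streaming query is started on the returned DataFrame.
    """
    cfg = mqtt_source_config(broker_url, topic)
    return (
        spark.readStream
        .format(cfg["format"])
        .option("topic", cfg["topic"])
        .load(cfg["path"])
    )


if __name__ == "__main__":
    # Show the configuration without needing Spark or a broker.
    print(mqtt_source_config("tcp://localhost:1883", "sensors/temp"))
```

With a live session, the returned DataFrame would typically be handed to a sink, for instance read_mqtt_stream(spark, "tcp://localhost:1883", "sensors/temp").writeStream.format("console").start(). Keeping the settings in a small dictionary is just a design choice here so the configuration can be inspected without a cluster.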
9. Available Books
● See “Big Data Made Easy”
– Apress Jan 2015
● See “Mastering Apache Spark”
– Packt Oct 2015
● See “Complete Guide to Open Source Big Data Stack”
– Apress Jan 2018
● Find the author on Amazon
– www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
● Connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
10. Connect
● Feel free to connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
● See my open source blog at
– open-source-systems.blogspot.com/
● I am always interested in
– New technology
– Opportunities
– Technology based issues
– Big data integration