Building a Real-Time Data Pipeline with Spark, Kafka, and Python
- 4. massively parallel, lock-free, FAST
distributed SQL database
in-memory, on-disk
ACID
JSON and geospatial
transactions and analytics
- 12. from pystreamliner.api import Extractor

      class CustomExtractor(Extractor):
          def initialize(self, streaming_context,
                         sql_context, config, interval, logger):
              # Log that the extractor has been set up.
              logger.info("Initialized Extractor")

          def next(self, streaming_context, time,
                   sql_context, config, interval, logger):
              # Build a single-column DataFrame holding the numbers 0 through 9.
              rdd = streaming_context._sc.parallelize(
                  [[x] for x in range(10)])
              return sql_context.createDataFrame(rdd, ["number"])
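To see what the extractor on slide 12 emits per batch without a Spark cluster, the sketch below rebuilds the same rows in plain Python. It is an illustration only: the `records` name and the dict representation are assumptions made here for readability, not part of the pystreamliner or Spark API. It simply pairs each single-element row with the `"number"` column name, roughly as the DataFrame schema would.

```python
# Plain-Python view of the rows built in CustomExtractor.next (no Spark needed).
rows = [[x] for x in range(10)]  # ten single-element rows: [0], [1], ..., [9]
columns = ["number"]             # same column list passed to createDataFrame

# Hypothetical representation: pair each row with its column name.
records = [dict(zip(columns, row)) for row in rows]

print(len(records))  # 10
print(records[0])    # {'number': 0}
print(records[-1])   # {'number': 9}
```

Each batch therefore carries ten rows with a single integer column, which is presumably enough to exercise the pipeline end to end before swapping in a real source such as Kafka.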
- 15. > memsql-ops pip install [package]
distributed cluster-wide
any Python package
bring your own