
Session 09
FIT5202 Big Data Processing
Data Streaming using Apache Kafka
Session 09 Agenda
• Session 08 Review

• Implicit vs Explicit Data
• Matrix Factorization
• Collaborative Filtering with ALS
• Streaming using Apache Kafka
• Visualizing in real-time
• Use case: Clickstream visualization
• Demo: Word count
• Lab Task: Clickstream Analysis and Visualization

Kafka Use Case (Traffic Data Monitoring)
https://www.infoq.com/articles/traffic-data-monitoring-iot-kafka-and-spark-streaming/

What is Apache Kafka?
• Publish-subscribe messaging system
• Enables distributed applications
• Brokers use Apache ZooKeeper to manage and coordinate the cluster
• Each broker can handle hundreds of thousands of reads and writes per second (and terabytes of messages) without a noticeable impact on performance
https://www.cloudkarafka.com/blog/2016-11-30-part1-kafka-for-beginners-what-is-apache-kafka.html
https://www.instaclustr.com/apache-kafka-architecture/

DEMO: Kafka Implementation Scenarios for Lab
Visualize in Real-time
[Diagram: Producer (Multiple Producers); topics Week8-Topic and Week8-Topic1]
• Aggregation + Visualization
• Rolling Mean + Visualization
• Real-time visualization with multiple aggregations and multiple graphs, annotating interesting points
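
The rolling-mean scenario can be illustrated with a small sketch. This is not the lab solution: the class name `RollingMean` and the window size are assumptions made for illustration, and the values would normally come from a Kafka consumer rather than a hard-coded list.

```python
from collections import deque


class RollingMean:
    """Mean of the most recent `window` values (hypothetical helper)."""

    def __init__(self, window):
        if window <= 0:
            raise ValueError("window must be positive")
        # A bounded deque drops the oldest value automatically once full.
        self.values = deque(maxlen=window)

    def update(self, value):
        """Add a new value and return the mean of the current window."""
        self.values.append(value)
        return sum(self.values) / len(self.values)


if __name__ == "__main__":
    rolling = RollingMean(window=3)
    for value in [1, 2, 3, 4]:
        print(value, rolling.update(value))
```

Each returned mean can be appended to the series being plotted, so the graph shows a smoothed line next to the raw values.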

Producer and Consumer Properties
▪ KafkaProducer
• bootstrap_servers
• value_serializer
• api_version
▪ KafkaConsumer
• consumer_timeout_ms
• auto_offset_reset
• bootstrap_servers
• value_deserializer
• api_version
https://kafka-python.readthedocs.io/en/master/apidoc/KafkaConsumer.html
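
The properties listed above could be wired together roughly as follows with kafka-python. This is a minimal sketch under stated assumptions: the broker address `localhost:9092`, the JSON encoding, the timeout value, and the `api_version` tuple are illustrative choices, not values given in the slides. The topic name is borrowed from the demo scenarios.

```python
import json

# Assumptions for illustration: a broker on localhost and JSON-encoded values.
BOOTSTRAP_SERVERS = "localhost:9092"
TOPIC = "Week8-Topic"


def serialize(value):
    """value_serializer: Python object -> UTF-8 JSON bytes."""
    return json.dumps(value).encode("utf-8")


def deserialize(raw):
    """value_deserializer: UTF-8 JSON bytes -> Python object."""
    return json.loads(raw.decode("utf-8"))


def make_producer():
    # Imported here so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    return KafkaProducer(
        bootstrap_servers=[BOOTSTRAP_SERVERS],
        value_serializer=serialize,
        api_version=(0, 10),
    )


def make_consumer():
    from kafka import KafkaConsumer

    return KafkaConsumer(
        TOPIC,
        bootstrap_servers=[BOOTSTRAP_SERVERS],
        value_deserializer=deserialize,
        auto_offset_reset="earliest",  # read from the oldest offset if none is committed
        consumer_timeout_ms=10000,     # stop iterating after 10 s without new messages
        api_version=(0, 10),
    )


if __name__ == "__main__":
    producer = make_producer()
    producer.send(TOPIC, {"clicks": 3, "impressions": 40})
    producer.flush()  # block until the buffered message is sent

    for message in make_consumer():
        print(message.value)
```

Running the script requires a reachable Kafka broker. With `consumer_timeout_ms` set, the consumer loop ends on its own once no message arrives within the timeout.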

Lab Task for .csv
• Data: clickstream
• Real-time visualization
• Total clicks / Total impressions

To be covered in Session 10 Lecture

DEMO: Streaming Word Count
• Input: words/sentences
• Socket: 9999
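
The counting logic behind a streaming word count can be sketched in plain Python. The slides do not say which framework the demo uses, so this sketch only uses the standard library. The host `localhost` and the helper names are assumptions; port 9999 is taken from the slide.

```python
import socket
from collections import Counter


def running_word_count(lines):
    """Yield a snapshot of the cumulative word counts after each line."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
        yield dict(counts)


def socket_lines(host="localhost", port=9999):
    """Yield text lines from a TCP socket until the sender closes it."""
    with socket.create_connection((host, port)) as conn:
        with conn.makefile("r", encoding="utf-8") as stream:
            for line in stream:
                yield line.rstrip("\n")


if __name__ == "__main__":
    # Requires something listening on the port and sending text lines.
    for snapshot in running_word_count(socket_lines()):
        print(snapshot)
```

Because `running_word_count` accepts any iterable of lines, it can be tried on a plain list before a socket source is attached.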

Lab Task for Streaming Clickstream.csv
• Data: clickstream
• Real-time visualization
• Clicks per minute
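
A clicks-per-minute aggregation could look like the sketch below. The record layout, a `(timestamp, clicks)` pair, is an assumption made for illustration; the actual columns of the lab's CSV file are not shown in the slides.

```python
from collections import defaultdict
from datetime import datetime


def clicks_per_minute(events):
    """Sum clicks per minute from (timestamp, clicks) pairs (assumed layout)."""
    totals = defaultdict(int)
    for timestamp, clicks in events:
        # Truncate the timestamp to the start of its minute.
        minute = timestamp.replace(second=0, microsecond=0)
        totals[minute] += clicks
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    sample = [
        (datetime(2024, 1, 1, 10, 0, 5), 2),
        (datetime(2024, 1, 1, 10, 0, 40), 1),
        (datetime(2024, 1, 1, 10, 1, 3), 4),
    ]
    for minute, total in clicks_per_minute(sample).items():
        print(minute, total)
```

In a streaming setting the same bucketing would be applied to each incoming batch, with the per-minute totals redrawn on the graph as new data arrives.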

Thank You!
See you next week.
