Building a Real-Time Analytics Dashboard with Apache Spark
 
Source: InfoQ. Published: 2016-12-9
 

In this article, we will learn how to build a real-time analytics dashboard using Apache Spark Streaming, Kafka, Node.js, Socket.IO, and Highcharts.

Problem Statement

An e-commerce portal wants to build a real-time analytics dashboard that visualizes the number of orders being shipped every minute, in order to optimize the efficiency of its logistics.

Solution

Before diving into the solution, let's take a quick look at the tools we will use:

Apache Spark – a general-purpose engine for fast, large-scale data processing. Spark runs batch workloads nearly 10x faster than Hadoop MapReduce, and in-memory data analysis nearly 100x faster. More information about Apache Spark.

Python – a widely used high-level, general-purpose, interpreted, dynamic programming language. More information about Python.

Kafka – a high-throughput, distributed publish-subscribe messaging system. More information about Kafka.

Node.js – an event-driven, server-side JavaScript I/O environment running on the V8 engine. More information about Node.js.

Socket.IO – a JavaScript library for building real-time web applications. It enables real-time, bidirectional communication between web clients and servers. More information about Socket.IO.

Highcharts – interactive JavaScript charts for web pages. More information about Highcharts.

CloudxLab – provides a real, cloud-based environment for practicing and learning various tools. You can start practicing right away by registering online.

How do we build the data pipeline?

Below is a high-level architecture diagram of the data pipeline:

Figure: Data pipeline

Figure: Real-time analytics dashboard

Let's walk through each stage of the data pipeline and build out the solution.

Stage 1

When a customer purchases an item in the system, or an order's status changes in the order management system, the corresponding order ID along with its status and timestamp is pushed to the corresponding Kafka topic.
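Although the portal itself is simulated in this tutorial, the event produced at this stage is easy to sketch. The snippet below uses a hypothetical `format_order_event` helper (not part of the project repository) to serialize an order event into the comma-separated form the rest of the pipeline expects; the commented lines show how a producer such as kafka-python's `KafkaProducer` might send it.

```python
from datetime import datetime

def format_order_event(ts: datetime, order_id: str, status: str) -> bytes:
    """Serialize an order event into the 'DateTime,OrderId,Status' form."""
    return f"{ts.strftime('%Y-%m-%d %H:%M:%S')},{order_id},{status}".encode("utf-8")

# With a real portal, a producer would push this to the "order-data" topic, e.g.:
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers="localhost:6667")
#   producer.send("order-data", format_order_event(...))
msg = format_order_event(datetime(2016, 7, 13, 14, 20, 33), "xxxxx-xxx", "processing")
print(msg.decode("utf-8"))  # 2016-07-13 14:20:33,xxxxx-xxx,processing
```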

Dataset

Since we do not have a real online e-commerce portal, we will simulate one using a dataset of CSV files. Let's look at the dataset:

DateTime, OrderId, Status
2016-07-13 14:20:33,xxxxx-xxx,processing
2016-07-13 14:20:34,xxxxx-xxx,shipped
2016-07-13 14:20:35,xxxxx-xxx,delivered

The dataset contains three columns: "DateTime", "OrderId", and "Status". Each row represents an order's status at a particular time. Here "xxxxx-xxx" stands in for the order ID. Since we are only interested in the number of orders shipped per minute, we do not need the actual order IDs.

The dataset is located in the spark-streaming/data/order_data folder of the project.
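As a preview of what the pipeline will eventually compute, rows like the ones above can be grouped per minute with nothing but the Python standard library. This is a standalone sketch for intuition only; in the pipeline itself the counting is done by Spark Streaming in a later stage.

```python
import csv
import io
from collections import Counter

sample = """\
2016-07-13 14:20:33,xxxxx-xxx,processing
2016-07-13 14:20:34,xxxxx-xxx,shipped
2016-07-13 14:20:35,xxxxx-xxx,delivered
"""

# Truncate each timestamp to the minute and count (minute, status) pairs.
per_minute = Counter()
for dt, order_id, status in csv.reader(io.StringIO(sample)):
    minute = dt[:16]                      # "YYYY-MM-DD HH:MM"
    per_minute[(minute, status)] += 1

print(per_minute[("2016-07-13 14:20", "shipped")])  # 1
```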

Pushing the dataset to Kafka

A shell script takes each row from these CSV files and pushes it to Kafka. After pushing one CSV file, the script waits one minute before pushing the next. This simulates a real-time e-commerce environment in which order statuses are updated at varying intervals. In a real-world scenario, the corresponding order details would be pushed to Kafka whenever an order's status changes.
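The script's behavior can be summarized in a few lines of Python. This is a hypothetical sketch, not the actual put_order_data_in_topic.sh; the Kafka call is abstracted behind a `send` callable so the loop can be demonstrated without a broker.

```python
import time

def push_order_data(csv_files, send, delay_seconds=60):
    """Push each CSV file's rows to Kafka via `send`, pausing between
    files to mimic order statuses arriving over time.
    `csv_files` is an iterable of file contents (strings)."""
    for contents in csv_files:
        for line in contents.splitlines():
            if line.strip():
                send(line.strip())    # e.g. producer.send("order-data", line.encode())
        time.sleep(delay_seconds)     # the one-minute gap between files

# Demo with an in-memory "file" and no delay:
sent = []
push_order_data(["2016-07-13 14:20:33,xxxxx-xxx,processing\n"], sent.append, delay_seconds=0)
print(sent)  # ['2016-07-13 14:20:33,xxxxx-xxx,processing']
```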

ÔËÐÐÎÒÃǵÄshell½Å±¾½«Êý¾ÝÍÆË͵½KafkaÖ÷ÌâÖС£µÇ¼µ½ CloudxLab Web¿ØÖÆÌ¨ ²¢ÔËÐÐÒÔÏÂÃüÁî¡£

# Clone the repository

git clone https://github.com/singhabhinav/cloudxlab.git

# Create the order-data topic in Kafka

export PATH=$PATH:/usr/hdp/current/kafka-broker/bin
kafka-topics.sh --create --zookeeper localhost:2181 \
--replication-factor 1 --partitions 1 --topic order-data

# Go to Kafka directory

cd cloudxlab/spark-streaming/kafka

# Run the Script for pushing data to Kafka topic
# ip-172-31-13-154.ec2.internal is the hostname of broker.
# Find list of brokers in Ambari (a.cloudxlab.com:8080).
# Use hostname of any one of the brokers
# order-data is the Kafka topic

/bin/bash put_order_data_in_topic.sh ../data/order_data/ \
ip-172-31-13-154.ec2.internal:6667 order-data

# The script will push the CSV files to the Kafka topic
# one by one, at one-minute intervals

# Let the script run. Do not close the terminal

Stage 2

After Stage 1, each message in the Kafka "order-data" topic will look like the following:

2016-07-13 14:20:33,xxxxx-xxx,processing

Stage 3

The Spark Streaming code fetches data from the "order-data" Kafka topic in 60-second windows and processes it, counting the orders in each status within that 60-second window. After processing, the total count for each order status is pushed to the "order-one-min-data" Kafka topic.
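The actual job is written against Spark Streaming's DStream API, but the per-window transformation it performs can be sketched in plain Python: take the lines that arrived in one 60-second batch, count them by status, and serialize the counts as the kind of JSON that gets pushed to "order-one-min-data". The `summarize_window` helper below is illustrative only, not the project's code.

```python
import json
from collections import Counter

def summarize_window(lines):
    """Count orders per status in one 60-second batch and emit the
    Stage 4-style JSON summary."""
    counts = Counter(line.split(",")[2] for line in lines)
    return json.dumps(dict(counts), sort_keys=True)

window = [
    "2016-07-13 14:20:33,xxxxx-xxx,processing",
    "2016-07-13 14:20:34,xxxxx-xxx,shipped",
    "2016-07-13 14:20:35,xxxxx-xxx,shipped",
]
print(summarize_window(window))  # {"processing": 1, "shipped": 2}
```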

Run the Spark Streaming code in the web console:

# Login to CloudxLab web console in the second tab

# Create order-one-min-data Kafka topic

export PATH=$PATH:/usr/hdp/current/kafka-broker/bin
kafka-topics.sh --create --zookeeper localhost:2181 \
--replication-factor 1 --partitions 1 --topic order-one-min-data

# Go to spark directory

cd cloudxlab/spark-streaming/spark

# Run the Spark Streaming code

spark-submit --jars spark-streaming-kafka-assembly_2.10-1.6.0.jar \
spark_streaming_order_status.py localhost:2181 order-data

# Let the script run. Do not close the terminal

Stage 4

In this stage, each message in the Kafka topic "order-one-min-data" will be a JSON string similar to the following:

{
"shipped": 657,
"processing": 987,
"delivered": 1024
}

Stage 5

Running the Node.js server

Now we will run a Node.js server that consumes messages from the "order-one-min-data" Kafka topic and pushes them to the web browser, so that the number of orders shipped per minute can be displayed in the browser.

Run the following commands in the web console to start the Node.js server:

# Login to CloudxLab web console in the third tab

# Go to node directory

cd cloudxlab/spark-streaming/node

# Install dependencies as specified in package.json

npm install

# Run the node server

node index.js

# Let the server run. Do not close the terminal

The node server will now be running on port 3001. If you get an "EADDRINUSE" error when starting the node server, edit the index.js file and change the port to 3002, 3003, 3004, and so on, until you find one that works. Use any available port in the 3001-3010 range to run the node server.

Accessing from the browser

Once the node server has started, go to http://YOUR_WEB_CONSOLE:PORT_NUMBER to access the real-time analytics dashboard. For example, if your web console is f.cloudxlab.com and the node server is running on port 3002, go to http://f.cloudxlab.com:3002 to view the dashboard.

When we visit the URL above, the socket.io-client library is loaded into the browser, opening a bidirectional communication channel between the server and the browser.

Stage 6

As soon as a new message arrives in Kafka's "order-one-min-data" topic, the node process consumes it. The consumed message is then sent to the web browser via socket.io.

Stage 7

Once the socket.io-client in the web browser receives a new "message" event, the data in the event is processed. If the received data contains the order status "shipped", the count is added to the Highcharts series and displayed in the browser.
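The browser-side handler is written in JavaScript, but its logic is small enough to sketch in Python for clarity: parse the JSON message, pull out the "shipped" count, and pair it with a timestamp in the `[epoch-millis, value]` shape that Highcharts' `series.addPoint()` accepts. `shipped_point` is a hypothetical name, used only for this illustration.

```python
import json
from datetime import datetime, timezone

def shipped_point(message: str, received_at: datetime):
    """Extract the 'shipped' count from a Stage 4 JSON message as an
    [x, y] pair (epoch milliseconds, count) for a Highcharts series."""
    counts = json.loads(message)
    x = int(received_at.timestamp() * 1000)
    return [x, counts.get("shipped", 0)]

msg = '{"shipped": 657, "processing": 987, "delivered": 1024}'
point = shipped_point(msg, datetime(2016, 7, 13, 14, 21, tzinfo=timezone.utc))
print(point)  # [1468419660000, 657]
```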

Screenshot

We have successfully built a real-time analytics dashboard. This is a basic example demonstrating how to integrate Spark Streaming, Kafka, Node.js, and Socket.IO to build a real-time analytics dashboard. With these basics in place, we can use the tools above to build more sophisticated systems.

   