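Hadoop is the most popular technology in the big data field, but it is not the only one. Many other technologies can be used to solve big data problems. Besides Apache Hadoop, the following nine big data technologies are also worth knowing.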
1.Apache Flink
2.Apache Samza
3.Google Cloud Dataflow
4.StreamSets
5.TensorFlow
6.Apache NiFi
7.Druid
8.LinkedIn WhereHows
9.Microsoft Cognitive Services
Apache Flink: an efficient, distributed, general-purpose big data analytics engine implemented in Java. It combines the efficiency, flexibility, and scalability of distributed MapReduce-style platforms with the query optimization techniques of parallel databases. It supports both batch and stream-based data analysis and provides APIs for Java and Scala.
It is a community-driven open-source framework for distributed big data analytics, similar to Apache Hadoop and Apache Spark. Its engine improves performance through data streaming, in-memory processing, and iterative operations. Flink entered the Apache Incubator in April 2014 and has since become a Top Level Project (TLP), with many contributors worldwide.
Flink draws on MPP database technology (declarative queries, a query optimizer, parallel in-memory and out-of-core algorithms) and on Hadoop MapReduce technology (massive scale-out, user-defined functions, schema on read), and adds a number of distinctive features of its own (streaming, iterations, dataflow, a general API).
Apache Samza: an open-source, distributed stream processing framework. It uses the open-source distributed messaging system Apache Kafka for messaging, and the resource manager Apache Hadoop YARN for fault tolerance, processor isolation, security, and resource management.
The technology was developed at LinkedIn, originally to address scalability problems with Apache Kafka. Its features include a simple API, managed state, fault tolerance, durable messaging, scalability, extensibility, and processor isolation.

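Samza code can run as a YARN job. Developers implement the StreamTask interface and define its process() method, which is called for each incoming message. A StreamTask runs inside a task instance, which itself lives inside a YARN container.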
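The StreamTask idea can be illustrated with a small sketch. Samza's real API is a Java interface; the Python below is only an analogy, and the names Envelope, StreamTask, WordCountTask, and run_task are hypothetical stand-ins rather than Samza classes. It shows a task whose process() method is invoked once per incoming message and emits results through a collector.

```python
from abc import ABC, abstractmethod
from collections import Counter
from dataclasses import dataclass


@dataclass
class Envelope:
    """Hypothetical stand-in for one incoming message."""
    key: str
    message: str


class StreamTask(ABC):
    """Python analogy of a stream task: one callback per message."""

    @abstractmethod
    def process(self, envelope, collector):
        """Handle one message; append any output to `collector`."""


class WordCountTask(StreamTask):
    def __init__(self):
        self.counts = Counter()  # local state kept by the task

    def process(self, envelope, collector):
        # Update the running count for each word and emit the new value.
        for word in envelope.message.split():
            self.counts[word] += 1
            collector.append((word, self.counts[word]))


def run_task(task, envelopes):
    """Hypothetical driver: feeds messages to the task one at a time."""
    collector = []
    for envelope in envelopes:
        task.process(envelope, collector)
    return collector


output = run_task(WordCountTask(), [Envelope("k1", "a b a"), Envelope("k2", "b")])
print(output)
```

In a real deployment the framework, not user code, plays the role of run_task: it reads from the input stream and calls process() inside the container.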
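Cloud Dataflow: Dataflow is a native Google Cloud data processing service for building, managing, and optimizing complex data pipelines; it can be used to build mobile applications and to debug, trace, and monitor production-grade cloud applications. It is based on Google's internal technologies Flume and MillWheel: Flume handles efficient parallel data processing, while MillWheel handles Internet-scale stream processing with strong fault tolerance.
It offers a simple programming model that covers both batch and streaming workloads. Its managed service controls the execution of data processing jobs, which can be created with the Dataflow SDK (Apache Beam).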

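Google Cloud Dataflow provides management, monitoring, and security capabilities for data-related tasks. Sources and sinks abstract the read and write operations in a pipeline. The pipeline encapsulates the entire sequence of computations: it accepts input data from external sources, transforms it, and produces output data.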
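The source, transform, and sink structure can be sketched in a few lines of plain Python. This is not the Dataflow SDK or Apache Beam; list_source, ListSink, and Pipeline are hypothetical names chosen for illustration, and the sketch only assumes that a pipeline reads from a source, applies transforms in order, and writes to a sink.

```python
from typing import Callable, Iterable, Iterator, List


def list_source(items: Iterable[str]) -> Iterator[str]:
    """Hypothetical source: reads records from an in-memory collection."""
    yield from items


class ListSink:
    """Hypothetical sink: writes records to an in-memory list."""

    def __init__(self) -> None:
        self.records: List[object] = []

    def write(self, record: object) -> None:
        self.records.append(record)


class Pipeline:
    """Encapsulates the whole computation: read, transform, write."""

    def __init__(self, source: Iterator) -> None:
        self.source = source
        self.transforms: List[Callable[[Iterator], Iterator]] = []

    def apply(self, transform: Callable[[Iterator], Iterator]) -> "Pipeline":
        self.transforms.append(transform)
        return self  # allow chaining

    def run(self, sink: ListSink) -> None:
        data = self.source
        for transform in self.transforms:
            data = transform(data)  # each step wraps the previous one lazily
        for record in data:
            sink.write(record)


sink = ListSink()
(
    Pipeline(list_source(["3", "x", "4"]))
    .apply(lambda rows: (r for r in rows if r.isdigit()))  # drop bad input
    .apply(lambda rows: (int(r) * 10 for r in rows))       # convert and scale
    .run(sink)
)
print(sink.records)
```

Because the source and sink hide how records are read and written, the same transforms could in principle be reused with a different source or sink.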
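StreamSets: StreamSets is a data processing platform optimized for data in motion. It provides a visual model for building data flows and is released as open source. It can be deployed on premises or in the cloud, and comes with rich monitoring and management interfaces.
The Data Collector uses data pipelines to stream and process data in real time. A pipeline describes how data flows from its origin to its final destination, and can consist of origins, destinations, and processors. The Data Collector's lifecycle can be controlled through the administration console.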
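TensorFlow: the second-generation machine learning system, succeeding DistBelief. TensorFlow grew out of the Google Brain project, whose main goal is to apply various kinds of neural network machine learning to different products and services across Google.
With distributed computing support, TensorFlow lets users train distributed models on their own machine learning infrastructure. The system is backed by the high-performance gRPC library and complements the recently released Google Cloud Machine Learning, which lets users train and serve TensorFlow models on Google Cloud Platform.
It is an open-source software library for numerical computation using data flow graphs, and it is used by various Google projects, including DeepDream, RankBrain, and Smart Reply.

A data flow graph describes a numerical computation as a directed graph of nodes and edges. Nodes represent mathematical operations, while edges represent the multidimensional data arrays (tensors) communicated between nodes. Edges also describe the input/output relationships between nodes. The name "TensorFlow" reflects tensors flowing through the graph.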
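To make the graph model concrete, here is a minimal sketch in plain Python rather than the TensorFlow API. Node, constant, add, matmul, and evaluate are hypothetical helpers; tensors are represented as nested lists, and the sketch assumes all matrices have compatible shapes.

```python
class Node:
    """A graph node: an operation plus the nodes feeding its inputs."""

    def __init__(self, op, inputs=()):
        self.op = op
        self.inputs = tuple(inputs)  # incoming edges


def constant(value):
    """Node with no inputs that always produces `value`."""
    return Node(lambda: value)


def add(a, b):
    """Element-wise addition of two 2-D tensors."""
    def op(x, y):
        return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]
    return Node(op, (a, b))


def matmul(a, b):
    """Matrix product of two 2-D tensors."""
    def op(x, y):
        cols = list(zip(*y))
        return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in x]
    return Node(op, (a, b))


def evaluate(node, cache=None):
    """Compute a node by first computing everything that flows into it."""
    cache = {} if cache is None else cache
    if id(node) not in cache:
        args = [evaluate(n, cache) for n in node.inputs]
        cache[id(node)] = node.op(*args)
    return cache[id(node)]


# Build the graph first; nothing is computed until evaluate() is called.
x = constant([[1, 2], [3, 4]])
w = constant([[1, 0], [0, 1]])
b = constant([[10, 10], [10, 10]])
y = add(matmul(x, w), b)

result = evaluate(y)
print(result)
```

Separating graph construction from evaluation is the key design point: the graph is a description of the computation, which is what allows a real system to optimize it or distribute it across machines.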
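Druid: Druid is a highly fault-tolerant, high-performance, open-source distributed system for real-time querying and analysis of big data. Created in 2011, it is designed to process large volumes of data quickly and to support fast queries and analytics. It can power interactive data applications, supports multi-tenancy with large numbers of concurrent users, scales to trillions of events per day, and delivers sub-second queries and real-time analytics. Druid also has several notable features, such as low-latency data ingestion, fast aggregation, arbitrary slice-and-dice, high availability, and both approximate and exact computation.
Druid was originally created to solve a query latency problem: attempts to use Hadoop for interactive query analysis could not meet real-time requirements. Druid provides interactive access to data, and adopts a special storage format that balances query flexibility against performance.
Other useful components include real-time nodes, historical nodes, Broker nodes, Coordinator nodes, and an indexing service, along with a JSON-based query language.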
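As an illustration of the JSON-based query language, the sketch below builds a timeseries query as a Python dictionary and serializes it. The field names follow Druid's native query format as commonly documented, but should be checked against the Druid version in use. The data source "page_events", the metric column "views", and the time interval are hypothetical, and no request is actually sent.

```python
import json


def build_timeseries_query(data_source, interval, metric):
    """Return a Druid-style timeseries query as a plain dict."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": "hour",
        "intervals": [interval],
        "aggregations": [
            {"type": "longSum", "name": f"total_{metric}", "fieldName": metric}
        ],
    }


query = build_timeseries_query(
    "page_events",                                # hypothetical data source
    "2016-01-01T00:00:00Z/2016-01-02T00:00:00Z",  # ISO 8601 interval
    "views",                                      # hypothetical metric column
)
payload = json.dumps(query, indent=2)
print(payload)
```

In practice a payload like this would be sent over HTTP to a Broker node, which fans the query out to the nodes holding the relevant data.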
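Apache NiFi: Apache NiFi is a powerful and reliable system for processing and distributing data, in which data routing and transformation are modeled as directed graphs. It provides a graphical interface for creating, monitoring, and controlling data flows, with rich configuration options; data flows can be modified at runtime and data partitions created dynamically. It also tracks data provenance as data moves through the system, and can be easily extended with custom components.
Apache NiFi is built around concepts such as FlowFile, Processor, and Connection.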
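How these three concepts fit together can be sketched in plain Python. This is a simplified model, not NiFi's Java API: the class shapes, the on_trigger method name, and the "processed.by" attribute are assumptions made for illustration. A FlowFile carries content plus attributes, a Connection queues FlowFiles between processors, and a Processor takes from one connection and puts to another.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """A piece of data moving through the flow: content plus attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


class Connection:
    """A queue linking two processors."""

    def __init__(self):
        self.queue = deque()

    def put(self, flow_file):
        self.queue.append(flow_file)

    def take(self):
        # Return the oldest queued FlowFile, or None when the queue is empty.
        return self.queue.popleft() if self.queue else None


class UppercaseProcessor:
    """Hypothetical processor: transforms content and tags the FlowFile."""

    def __init__(self, incoming, outgoing):
        self.incoming, self.outgoing = incoming, outgoing

    def on_trigger(self):
        """Process one FlowFile; return False when there is nothing to do."""
        flow_file = self.incoming.take()
        if flow_file is None:
            return False
        flow_file.content = flow_file.content.upper()
        flow_file.attributes["processed.by"] = "UppercaseProcessor"
        self.outgoing.put(flow_file)
        return True


inbound, outbound = Connection(), Connection()
inbound.put(FlowFile(b"hello", {"filename": "a.txt"}))
processor = UppercaseProcessor(inbound, outbound)
while processor.on_trigger():
    pass

result = outbound.take()
print(result.content, result.attributes)
```

Chaining several processors through connections yields the directed graph of data routing and transformation described above.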
LinkedIn WhereHows: WhereHows provides an enterprise catalog with metadata search that tells you where data is stored and how it got there. The tool offers collaboration and data lineage analysis, and connects to a variety of data sources and extract, transform, and load (ETL) tools.
It provides a web interface for data discovery, while an API-enabled backend server controls metadata crawling and integration with other systems.
Microsoft Cognitive Services: this technology grew out of Project Oxford and Bing and offers 22 cognitive computing APIs in five main categories: vision, speech, language, knowledge, and search. It has been integrated into the Cortana Intelligence Suite.
The APIs are exposed as REST endpoints, and open-source SDKs for Windows, iOS, Android, and Python are available to developers.