Applying SequoiaDB+SparkSQL to Data Statistics Scenarios
 
 
2020-4-10
 
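Editor's recommendation:
This article introduces how to connect and use two products together, SequoiaDB (distributed storage) and Spark (distributed computing), and how to improve statistical analysis performance on massive data sets.
This article comes from SequoiaDB and was edited and recommended by Alice of Huolongguo Software (火龙果软件).

Preface

In an era in which enterprise production data keeps growing, data is both where an enterprise's value lies and where its technical challenges lie. For massive data processing, people have realized that no matter how powerful a single machine is, it cannot keep up with ever-growing processing demands; a distributed architecture is the fundamental solution to this class of problems.

In the distributed field, two classes of products are essential: distributed storage and distributed computing. Only by making full use of the characteristics of both can users truly exploit the storage and computing capabilities of a distributed architecture.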

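Introduction to SequoiaDB

SequoiaDB is one of the few independently developed distributed databases in China. It supports both document storage and block storage, standard SQL and transactions, and complex index queries, and it integrates fairly deeply with Hadoop, Hive, and Spark. SequoiaDB is now open source on GitHub.

For distributed storage, SequoiaDB offers more data-splitting rules than typical big data products: horizontal splitting, range splitting, multi-dimensional partitioning (similar to partition), and multi-dimensional splitting. Users can choose the method that fits each scenario to improve the system's storage capacity and operational performance.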

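Introduction to Spark

Spark has grown very rapidly in recent years. After the official release of Spark 1.0 in particular, it gained support from many Silicon Valley giants such as Cloudera, IBM, Hortonworks, and Intel. Since Spark 2.0 announced support for all 99 TPC-DS queries, more and more developers have been using SparkSQL for big data processing and analysis. It is foreseeable that Spark will become the most important and popular distributed computing framework after Hadoop.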

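Introduction to SparkSQL

SparkSQL is a component of Spark; its SQL execution engine is implemented on Spark's RDD and DataFrame. SparkSQL can now run the complete set of 99 TPC-DS queries, which marks further technical maturity in data analysis and data processing scenarios.

SparkSQL resembles Hive, another popular big data SQL product. For example, both use a Thrift server as the JDBC service, and both use the same metadata code (SparkSQL in fact reuses Hive's metadata code). The two products still differ fundamentally, and the biggest difference is the execution engine: Hive supports the Hadoop and Tez computing frameworks by default, whereas SparkSQL supports only the Spark RDD computing framework. SparkSQL, however, has deeper execution-plan and processing-engine optimization.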

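Integrating SparkSQL with SequoiaDB

How it works

Readers familiar with Spark's internals will know that Spark itself is a distributed computing framework. Unlike Hadoop, it does not provide developers with both distributed computing and distributed storage. Instead it opens up a development interface for the storage layer: as long as a developer implements the interface methods according to Spark's interface specification, any storage product can become a data source for Spark computation, and therefore for SparkSQL as well.

SequoiaDB is a distributed database that can store massive data for users, but statistics and analysis over that data still need the concurrent computing performance of a distributed computing framework to improve efficiency.

SequoiaDB therefore developed the SequoiaDB for Spark connector, which lets Spark fetch data from SequoiaDB concurrently and then complete the corresponding computation.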

How to connect

Connecting Spark and SequoiaDB is fairly simple: add the SequoiaDB for Spark connector, spark-sequoiadb.jar, and the SequoiaDB Java driver, sequoiadb.jar, to the CLASSPATH of every Spark Worker.

For example, to connect SparkSQL to SequoiaDB, add the SPARK_CLASSPATH parameter to the spark-env.sh configuration file; if the parameter already exists, append the new jars to it, for example:

SPARK_CLASSPATH="/media/psf/mnt/sequoiadb-driver-2.9.0-SNAPSHOT.jar:/media/psf/mnt/spark-sequoiadb_2.11-2.9.0-SNAPSHOT.jar"

After modifying spark-env.sh, restart spark-sql or the thriftserver to complete the connection between Spark and SequoiaDB.
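The rule for SPARK_CLASSPATH (append the new jars when the variable already holds a value) can be sketched in Python. This is only an illustration: the helper name extend_classpath and the pre-existing entry /opt/spark/lib/other.jar are hypothetical, and the script merely prints the line that would go into spark-env.sh.

```python
def extend_classpath(existing, new_jars):
    """Append jars to a colon-separated classpath, skipping duplicates."""
    entries = [entry for entry in existing.split(":") if entry]
    for jar in new_jars:
        if jar not in entries:
            entries.append(jar)
    return ":".join(entries)


# Jar paths taken from the example in the article.
SDB_JARS = [
    "/media/psf/mnt/sequoiadb-driver-2.9.0-SNAPSHOT.jar",
    "/media/psf/mnt/spark-sequoiadb_2.11-2.9.0-SNAPSHOT.jar",
]

# Hypothetical value that was already present in spark-env.sh.
existing_value = "/opt/spark/lib/other.jar"

print('SPARK_CLASSPATH="%s"' % extend_classpath(existing_value, SDB_JARS))
```

Running the helper twice with the same jars leaves the value unchanged, which avoids duplicate entries when the configuration is edited repeatedly.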

SparkSQL+SequoiaDB performance optimization

Performance optimization for SparkSQL+SequoiaDB is covered from four angles: how the connector works, SparkSQL optimization, SequoiaDB optimization, and connector parameter optimization.

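Introduction to the SequoiaDB for SparkSQL connector

1) How the connector works

Although Spark offers users many functional modules, all of them are data computation modules. Spark itself has no storage capability; by default it reads data from the local file server or from HDFS. Spark also opens its storage-layer interface to developers: as long as a developer implements a storage-layer connector according to Spark's interface specification, any data source can become a data source for Spark computation.

[Figure: relationship between Spark workers and the datanodes in the storage layer]

[Figure: relationship between the Spark computing framework and the storage layer]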

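After receiving a computing task, the Spark master first communicates with the storage layer and, from the storage layer's access snapshot or storage plan, obtains the storage layout of all data involved in the task. The storage layer returns to the Spark master a queue of data storage partitions.

The Spark master then assigns the partitions in that queue to Spark workers one by one. Once a Spark worker receives a partition's information, it knows how to fetch the data to compute on. The worker then actively connects to the storage-layer nodes, fetches the data, and starts computing according to the task the Spark master sent to it.

The SequoiaDB for Spark connector works essentially as described above, except that when it generates the partition tasks it uses the query conditions pushed down by Spark to generate a query plan in SequoiaDB.

If SequoiaDB can use an index scan for the query conditions, the partition tasks generated by the connector have the Spark workers connect directly to SequoiaDB's data nodes.

If SequoiaDB cannot use an index scan for the query conditions, the connector obtains the information of all data blocks of the relevant table and then, according to the partitionblocknum and partitionmaxnum parameters, generates partition tasks that each contain connection information for several data blocks.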
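The two branches (index scan versus data-block scan) can be pictured with a small Python model. It is a sketch under stated assumptions, not the connector's code: PartitionTask, generate_tasks, the node addresses, and the per-node grouping are all hypothetical, and nothing here talks to SequoiaDB.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PartitionTask:
    node: str                 # data node the Spark worker connects to
    blocks: Tuple[int, ...]   # block ids to read; empty means "query the node, let it use its index"


def generate_tasks(index_usable: bool,
                   blocks_by_node: Dict[str, List[int]],
                   blocks_per_task: int = 4) -> List[PartitionTask]:
    """Model the two cases: one task per data node, or tasks made of several data blocks."""
    tasks = []
    for node in sorted(blocks_by_node):
        blocks = blocks_by_node[node]
        if index_usable:
            # Index scan possible: the worker connects straight to the data node.
            tasks.append(PartitionTask(node, ()))
        else:
            # No usable index: split the node's blocks into groups of blocks_per_task.
            for start in range(0, len(blocks), blocks_per_task):
                tasks.append(PartitionTask(node, tuple(blocks[start:start + blocks_per_task])))
    return tasks


# Hypothetical layout: two data nodes holding five and two data blocks.
layout = {"sdb1:11820": [0, 1, 2, 3, 4], "sdb2:11820": [0, 1]}
for task in generate_tasks(False, layout):
    print(task)
```

With an index the model yields one task per node; without one, the number of tasks grows with the number of data blocks, which is why the block-related parameters matter for large tables.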

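2) Connector parameters

The SequoiaDB for Spark connector was refactored after SequoiaDB 2.10 to improve the performance of concurrent reads from SequoiaDB by Spark, and its parameters were adjusted accordingly.

To create a table in SparkSQL whose data source is SequoiaDB, use the following template: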

create [temporary] <table|view> <name>[(schema)] using com.sequoiadb.spark options (<options>);

Keywords of the SparkSQL create-table command:

1. temporary: indicates whether the table or view is temporary. A table or view marked temporary is deleted automatically after the client restarts;

2. schema: the user may omit the table schema when creating the table. If the schema is not specified explicitly, SparkSQL detects it automatically from the existing data at creation time;

3. com.sequoiadb.spark: the entry class of the SequoiaDB for Spark connector;

4. options: the configuration parameters of the SequoiaDB for Spark connector.

A SparkSQL create-table example:

create table tableName (name string, id int) using com.sequoiadb.spark options (host 'sdb1:11810,sdb2:11810,sdb3:11810', collectionspace 'foo', collection 'bar', username 'sdbadmin', password 'sdbadmin');
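As an illustration of how the template maps onto a concrete statement, the following Python sketch renders the DDL text from its parts. The function build_create_table is hypothetical (it is not part of Spark or of the connector), it only produces a string, and it assumes option values contain no single quotes.

```python
def build_create_table(name, options, schema=None, temporary=False, kind="table"):
    """Render: create [temporary] <table|view> <name>[(schema)] using com.sequoiadb.spark options (...);"""
    if kind not in ("table", "view"):
        raise ValueError("kind must be 'table' or 'view'")
    parts = ["create"]
    if temporary:
        parts.append("temporary")
    parts.extend([kind, name])
    if schema:
        # schema is a list of (column, type) pairs; omit it to let SparkSQL detect the schema.
        parts.append("(" + ", ".join("%s %s" % (col, typ) for col, typ in schema) + ")")
    rendered = []
    for key, value in options.items():
        text = str(value)
        if "'" in text:
            raise ValueError("option values containing single quotes are not handled")
        rendered.append("%s '%s'" % (key, text))
    parts.append("using com.sequoiadb.spark options (" + ", ".join(rendered) + ");")
    return " ".join(parts)


statement = build_create_table(
    "tableName",
    {
        "host": "sdb1:11810,sdb2:11810,sdb3:11810",
        "collectionspace": "foo",
        "collection": "bar",
        "username": "sdbadmin",
        "password": "sdbadmin",
    },
    schema=[("name", "string"), ("id", "int")],
)
print(statement)
```

The printed statement is the same as the example in the article; passing temporary=True or dropping schema exercises the optional parts of the template.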

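The connector accepts further settings in the options clause of the create-table command. Those that matter for tuning (bulksize, partitionmode, partitionblocknum, and partitionmaxnum) are described in the connector optimization section.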

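SparkSQL optimization

Users who run statistical analysis over massive data with SparkSQL should tune performance in three respects:

1. Increase the maximum memory available to each Spark Worker, so that data does not exceed memory during computation and have to be partly written to temporary files;

2. Increase the number of Spark Workers, and let every Worker use all the CPU resources of its server, to improve concurrency;

3. Adjust Spark's runtime parameters.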

Worker memory and Worker count are set in the spark-env.sh configuration file: SPARK_WORKER_MEMORY controls the memory available to a Worker, and SPARK_WORKER_INSTANCES sets how many Workers each server starts.

To adjust Spark's runtime parameters, edit the spark-defaults.conf configuration file. The parameters that noticeably improve statistical computation over massive data are:

1) spark.storage.memoryFraction: the fraction of Worker memory used to store temporary computation data. The default is 0.6, meaning 60%;

2) spark.shuffle.memoryFraction: the fraction of each Worker's memory that shuffle may occupy during computation. The default is 0.2, meaning 20%. If little temporary computation data is stored but the computation involves many group by, sort, and join operations, consider increasing spark.shuffle.memoryFraction and decreasing spark.storage.memoryFraction, so that data exceeding memory does not have to be written to temporary files;

3) spark.serializer: the serialization method Spark uses at runtime. The default is org.apache.spark.serializer.JavaSerializer, but for better performance choose org.apache.spark.serializer.KryoSerializer.
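A small Python sketch can render the corresponding spark-defaults.conf lines for a shuffle-heavy workload. The helper render_spark_defaults and the chosen values (0.4 and 0.4) are illustrative assumptions, not recommendations from the article. One caveat worth checking against the Spark version in use: from Spark 1.6 onward the two memoryFraction settings are legacy options that are only honored when spark.memory.useLegacyMode is enabled, so the sketch can emit that line as well.

```python
def render_spark_defaults(storage_fraction, shuffle_fraction, use_kryo=True, legacy_mode=True):
    """Return spark-defaults.conf lines for the three settings discussed above."""
    for label, value in (("storage", storage_fraction), ("shuffle", shuffle_fraction)):
        if not 0.0 < value < 1.0:
            raise ValueError("%s fraction must be between 0 and 1" % label)
    if storage_fraction + shuffle_fraction > 1.0:
        raise ValueError("the two fractions must not add up to more than 1")
    lines = []
    if legacy_mode:
        # Assumption: needed on Spark 1.6 to 2.x for the fractions below to take effect.
        lines.append("spark.memory.useLegacyMode true")
    lines.append("spark.storage.memoryFraction %g" % storage_fraction)
    lines.append("spark.shuffle.memoryFraction %g" % shuffle_fraction)
    if use_kryo:
        lines.append("spark.serializer org.apache.spark.serializer.KryoSerializer")
    return "\n".join(lines)


# Shift memory from cached data toward shuffle for group by / sort / join heavy queries.
print(render_spark_defaults(0.4, 0.4))
```

The validation step is there because raising one fraction without lowering the other can claim more memory than the Worker has.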

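SequoiaDB optimization

Because the SparkSQL+SequoiaDB combination reads its data from SequoiaDB, performance optimization should consider three points:

1. Distribute the data of large tables as much as possible. Tables that meet the conditions for two-dimensional splitting should combine multi-dimensional splitting and hash splitting, two ways of balancing data, for distributed storage;

2. When importing data, avoid importing into several collections of the same collection space at the same time. Collections in one collection space share the same data file, so concurrent imports scatter each collection's data blocks too widely, and SparkSQL then has to read too many data blocks when fetching massive data from SequoiaDB;

3. If SparkSQL queries contain query conditions, create indexes on the corresponding fields in SequoiaDB.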

Connector optimization

Parameter tuning for the SequoiaDB for Spark connector covers two scenarios: reading data and writing data.

There is little room to tune writes: only the bulksize parameter can be adjusted. Its default value is 500, meaning the connector packs 500 records into one network packet before sending a write request to SequoiaDB. As a rule, set bulksize so that one network packet does not exceed 2 MB.
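The 2 MB guideline turns into simple arithmetic once the average record size is known. The sketch below assumes 2 MB means 2 * 1024 * 1024 bytes and that the average record size has been measured separately; suggest_bulksize is a hypothetical helper, not a connector API.

```python
def suggest_bulksize(avg_record_bytes, max_packet_bytes=2 * 1024 * 1024):
    """Largest record count whose combined size stays within max_packet_bytes (at least 1)."""
    if avg_record_bytes <= 0:
        raise ValueError("avg_record_bytes must be positive")
    return max(1, max_packet_bytes // avg_record_bytes)


# Records averaging 1 KB leave room for far more than the default of 500 per packet.
print(suggest_bulksize(1024))

# Records averaging 8 KB call for a value below the default.
print(suggest_bulksize(8 * 1024))
```

The result is an upper bound; protocol overhead per record is ignored here, so a somewhat smaller value leaves a safety margin.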

For reads, users need to pay attention to three parameters: partitionmode, partitionblocknum, and partitionmaxnum.

partitionmode: the connector's partition mode. Possible values are single, sharding, datablock, and auto; the default is auto, which lets the connector choose.

1. single: SparkSQL accesses SequoiaDB data without concurrency, using a single thread connected to a SequoiaDB Coord node. This value is generally used when sampling data to infer the table schema at table creation;

2. sharding: SparkSQL connects directly to each SequoiaDB datanode. This value generally suits SQL commands that contain query conditions for which SequoiaDB can use an index;

3. datablock: SparkSQL reads data by connecting concurrently to SequoiaDB data blocks. This value generally suits SQL commands that cannot use an index in SequoiaDB and that query a large amount of data;

4. auto: the connector analyzes each situation and decides how SparkSQL accesses SequoiaDB.

partitionblocknum: takes effect only when partitionmode=datablock. It sets how many SequoiaDB data-block read tasks each Worker fetches at a time during computation; the default is 4. If SequoiaDB stores a large amount of data and a computation touches many data blocks, increase this parameter so that the number of SparkSQL computing tasks stays within a reasonable range and reads are more efficient.

partitionmaxnum: takes effect only when partitionmode=datablock. It sets the maximum number of data-block read tasks the connector may generate; the default is 1000. It mainly prevents a very large data volume in SequoiaDB, and hence a very large total number of data blocks, from producing so many SparkSQL computing tasks that overall performance drops.
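To get a feel for how the two parameters bound the number of read tasks, here is a rough Python estimate. It rests on two assumptions that the article does not spell out: that partitionblocknum is the number of data blocks grouped into one task, and that partitionmaxnum acts as a plain upper limit on the task count. Both function names are hypothetical.

```python
import math


def estimate_read_tasks(total_blocks, partitionblocknum=4, partitionmaxnum=1000):
    """Estimated number of read tasks in datablock mode, under the assumptions above."""
    if total_blocks < 0 or partitionblocknum < 1 or partitionmaxnum < 1:
        raise ValueError("invalid argument")
    wanted = math.ceil(total_blocks / partitionblocknum)
    return min(wanted, partitionmaxnum)


def suggest_blocknum(total_blocks, target_tasks):
    """Smallest partitionblocknum that keeps the task count at or below target_tasks."""
    if total_blocks < 0 or target_tasks < 1:
        raise ValueError("invalid argument")
    return max(1, math.ceil(total_blocks / target_tasks))


# 100,000 data blocks with the defaults would need 25,000 tasks, so the limit of 1000 applies.
print(estimate_read_tasks(100000))

# Aiming for about 500 tasks instead suggests raising partitionblocknum to 200.
print(suggest_blocknum(100000, 500))
```

Treat the numbers as a starting point for experiments; the target task count depends on how many Worker cores are available.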

×ܽá

±¾ÎÄ´ÓSpark¡¢SequoiaDBÒÔ¼°SequoiaDB for Spark connectorÈý¸ö·½ÃæÏò¶ÁÕßÃǽéÉÜÁ˺£Á¿Êý¾ÝÏÂʹÓÃSparkSQL+SequoiaDBµÄÐÔÄܵ÷ÓÅ·½·¨¡£

ÎÄÕÂÖнéÉܵķ½·¨¾ßÓÐÒ»¶¨µÄ²Î¿¼ÒâÒ壬µ«ÊÇÐÔÄܵ÷ÓÅÒ»Ö±¶¼ÊÇ×Ñé¼¼ÊõÈËÔ±µÄ¹¤×÷¡£¼¼ÊõÈËÔ±ÔÚ¶Ô·Ö²¼Ê½»·¾³×öÐÔÄܵ÷ÓÅʱ£¬ÐèÒª×ۺϿ¼ÂǶà¸ö·½ÃæµÄÊý¾Ý£¬ÀýÈ磺·þÎñÆ÷µÄÓ²¼þ×ÊԴʹÓÃÇé¿ö¡¢SparkÔËÐÐ×´¿ö¡¢SequoiaDBÊý¾Ý·Ö²¼ÊÇ·ñºÏÀí¡¢Á¬»úÆ÷µÄ²ÎÊýÉèÖÃÊÇ·ñÕýÈ·¡¢SQLÃüÁîÊÇ·ñÓе÷ÓŵĿռäµÈ£¬ÒªÏëÐÔÄÜÌáÉý£¬ÖصãÊÇÒªÇó¼¼ÊõÈËÔ±ÕÒµ½Õû¸öϵͳÖеÄÐÔÄ̰ܶ壬Ȼºóͨ¹ýµ÷Õû²»Í¬µÄ²ÎÊý»òÕßÐ޸Ĵ洢·½°¸£¬´Ó¶øÈÃϵͳÔËÐеøü¼Ó¸ßЧ¡£

 
   