I tried to enable the Spark cost-based optimizer (CBO) by setting the property in spark-shell: spark.conf.set("spark.sql.cbo.enabled", true)
I then ran spark.sql("ANALYZE TABLE events COMPUTE STATISTICS").show
However, explaining the following query does not show me any statistics: spark.sql("select * from events where eventID=1").explain(true)
I am running Spark 2.2.1.
scala> spark.sql("select * from events where eventID=1").explain()
== Physical Plan ==
*Project [buyDetails.capacity#923, buyDetails.clearingNumber#924, buyDetails.leavesQty#925L, buyDetails.liquidityCode#926, buyDetails.orderID#927, buyDetails.side#928, cancelQty#929L, capacity#930, clearingNumber#931, contraClearingNumber#932, desiredLeavesQty#933L, displayPrice#934, displayQty#935L, eventID#936, eventTimestamp#937L, exchange#938, executionCodes#939, fillID#940, handlingInstructions#941, initiator#942, leavesQty#943L, nbbPrice#944, nbbQty#945L, nboPrice#946, ... 29 more fields]
+- *Filter (isnotnull(eventID#936) && (cast(eventID#936 as int) = 1))
   +- *FileScan parquet default.events[buyDetails.capacity#923,buyDetails.clearingNumber#924,buyDetails.leavesQty#925L,buyDetails.liquidityCode#926,buyDetails.orderID#927,buyDetails.side#928,cancelQty#929L,capacity#930,clearingNumber#931,contraClearingNumber#932,desiredLeavesQty#933L,displayPrice#934,displayQty#935L,eventID#936,eventTimestamp#937L,exchange#938,executionCodes#939,fillID#940,handlingInstructions#941,initiator#942,leavesQty#943L,nbbPrice#944,nbbQty#945L,nboPrice#946,... 29 more fields] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:/home/asehgal/data/events], PartitionFilters: [], PushedFilters: [IsNotNull(eventID)], ReadSchema: struct<buyDetails.capacity:string,buyDetails.clearingNumber:string,buyDetails.leavesQty:bigint,bu...
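
For reference, here is a minimal sketch of how I would check that the statistics were actually stored, since explain() on 2.2.x does not seem to print them. It assumes a spark-shell session where spark is already defined and the events table above exists. The name described is only illustrative, and analyzing the eventID column is my own assumption about what the optimizer would need to estimate this filter.

```scala
// Assumes spark-shell: `spark` is predefined and the `events` table exists.
spark.conf.set("spark.sql.cbo.enabled", true)

// Table-level statistics (row count and size in bytes).
spark.sql("ANALYZE TABLE events COMPUTE STATISTICS")

// Column-level statistics for the column used in the WHERE clause
// (assumption: the filter estimate depends on these).
spark.sql("ANALYZE TABLE events COMPUTE STATISTICS FOR COLUMNS eventID")

// Inspect the table metadata; if the ANALYZE commands succeeded, a
// statistics entry should be listed among the detailed table information.
val described = spark.sql("DESCRIBE EXTENDED events")
described.show(100, false)
```

If the statistics entry is missing from this output, the problem is with ANALYZE TABLE rather than with how the plan is displayed.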