java.io.IOException: Could not get input splits

DataStax JIRA | Sebastian Estevez | 11 months ago
  1.

    2 DC cluster:

{code}
$ nodetool status
Datacenter: SearchAnalytics
===========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns  Host ID                               Rack
UN  172.31.13.54  139.22 MB  1       ?     46998d2d-35b4-42eb-829c-f84fe618c9a1  rack1
Datacenter: Solr
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns  Host ID                               Rack
UN  172.31.14.43  178.99 MB  1       ?     8aa86c2d-3cf3-4a19-bfe6-9b834b347325  rack1
UN  172.31.2.3    39.08 MB   1       ?     3939212c-0588-4eee-8262-774f5c52c517  rack1
{code}

{code}
cqlsh> desc KEYSPACE test

CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'Solr': '1'} AND durable_writes = true;

CREATE TABLE test.albums (
    album_uid int PRIMARY KEY,
    album_name text,
    artists int,
    comment text,
    composer int,
    cooperative int,
    disc_num int,
    duration int,
    label_name text,
    track_name text,
    track_num int,
    track_uid int,
    year int
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
{code}

{code}
~$ dse spark --conf spark.cassandra.output.consistency.level=ONE spark.cassandra.input.consistency.level=ONE
{code}

    hc errors out:

{code}
scala> hc.sql("select * from test.albums").count()
WARN 2016-01-19 21:26:30 org.apache.hadoop.hive.cassandra.CassandraManager: Default CL is LOCAL_ONE. Because of replication factor of local data center is less than 1, set CL to ONE
WARN 2016-01-19 21:26:30 org.apache.spark.scheduler.DAGScheduler: Creating new stage failed due to exception - job: 0
java.io.IOException: Could not get input splits
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:203) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:331) ~[hive-0.13.1-cassandra-connector-0.2.9.jar:0.2.9]
    at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:269) ~[hive-0.13.1-cassandra-connector-0.2.9.jar:0.2.9]
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:78) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:206) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:204) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.dependencies(RDD.scala:204) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:321) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:333) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGScheduler.getParentStagesAndId(DAGScheduler.scala:234) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGScheduler.newResultStage(DAGScheduler.scala:270) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:768) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1429) [spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1421) [spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) [spark-core_2.10-1.4.2.2.jar:1.4.2.2]
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints
    at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[na:1.8.0_40]
    at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[na:1.8.0_40]
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:199) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    ... 41 common frames omitted
Caused by: java.io.IOException: failed connecting to all endpoints
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:317) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_40]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_40]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_40]
java.io.IOException: Could not get input splits
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:203)
    at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:331)
    at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:269)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82)
    at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:78)
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:206)
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:204)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.dependencies(RDD.scala:204)
    at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:321)
    at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:333)
    at org.apache.spark.scheduler.DAGScheduler.getParentStagesAndId(DAGScheduler.scala:234)
    at org.apache.spark.scheduler.DAGScheduler.newResultStage(DAGScheduler.scala:270)
    at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:768)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1429)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1421)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:199)
    ... 41 more
Caused by: java.io.IOException: failed connecting to all endpoints
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:317)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}

    sc hangs:

{code}
scala> sc.cassandraTable("test", "albums").count()
[Stage 0:> (0 + 1) / 3]
{code}

    Though it shows similar errors to the ones in HC on stderr.

    DataStax JIRA | 11 months ago | Sebastian Estevez
    java.io.IOException: Could not get input splits
  2.

    Titan 0.5.4: How to index/reindex data uploaded through Titan-Hadoop

    Stack Overflow | 10 months ago | Ruslan Mavlyutov
    java.io.IOException: Could not get input splits
  3.

    GitHub comment 693#236010483

    GitHub | 4 months ago | pluradj
    java.lang.IllegalStateException: java.io.IOException: Could not get input splits
  4.

    Cassandra Upgrade 0.8.2->0.8.4 get error "failed connecting to all endpoints"

    Stack Overflow | 5 years ago | Anton
    java.io.IOException: failed connecting to all endpoints slave1/98.188.69.242
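
    A side note on the dse spark invocation quoted in the first report above: with stock spark-shell / spark-submit, each configuration property takes its own --conf flag, so the second property on that command line (spark.cassandra.input.consistency.level) was probably never applied as a Spark setting. That is unlikely to be the root cause of the failure, but for reference, a sketch of the usual form would be:

{code}
# Sketch only: one --conf flag per Spark property (standard spark-shell/spark-submit syntax)
$ dse spark --conf spark.cassandra.input.consistency.level=ONE \
            --conf spark.cassandra.output.consistency.level=ONE
{code}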

    Root Cause Analysis

    1. java.io.IOException

      failed connecting to all endpoints

      at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits()
    2. org.apache.cassandra
      AbstractColumnFamilyInputFormat$SplitCallable.call
      1. org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:317)
      2. org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)
      3. org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236)
      4. org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221)
      4 frames
    3. Java RT
      Thread.run
      1. java.util.concurrent.FutureTask.run(FutureTask.java:266)
      2. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      3. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      4. java.lang.Thread.run(Thread.java:745)
      4 frames
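
    Putting the first report and the root cause chain together, this looks less like a network fault and more like a replication/topology mismatch: the keyspace test replicates only to the Solr datacenter ({'class': 'NetworkTopologyStrategy', 'Solr': '1'}), while the Spark/Hive job runs on the node in SearchAnalytics, which therefore holds no replicas. That is also what the "replication factor of local data center is less than 1" warning points at, and the split calculation then finds no endpoints it can connect to for the token ranges. Below is a minimal diagnostic-and-fix sketch, assuming the intent is for the SearchAnalytics datacenter to analyze a local copy of the data; the datacenter names and replication factors are taken from the nodetool and cqlsh output in the first report and would need to be adapted to the actual cluster.

{code}
# Diagnostic: which nodes own a sample key? If no SearchAnalytics address shows up,
# the analytics node has no local replica to read from. (The key "1" is just an example.)
$ nodetool getendpoints test albums 1

# Possible fix (sketch): add the analytics DC to the keyspace's replication,
# then stream the existing data to it by running rebuild on the SearchAnalytics node.
$ cqlsh -e "ALTER KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'Solr': '1', 'SearchAnalytics': '1'};"
$ nodetool rebuild -- Solr
{code}

    With a replica in the job's own datacenter, LOCAL_ONE reads and the Hadoop split/endpoint lookup have somewhere local to connect. If the data is deliberately confined to the Solr datacenter, the alternative is to run the analytics job against nodes in that datacenter instead.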