java.lang.IllegalStateException: java.io.IOException: Could not get input splits

GitHub | pluradj | 7 months ago
  1. 0

    GitHub comment 693#236010483

    GitHub | 7 months ago | pluradj
    java.lang.IllegalStateException: java.io.IOException: Could not get input splits
  2. 0

    2 DC cluster:

{code}
$ nodetool status
Datacenter: SearchAnalytics
===========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns  Host ID                               Rack
UN  172.31.13.54  139.22 MB  1       ?     46998d2d-35b4-42eb-829c-f84fe618c9a1  rack1
Datacenter: Solr
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns  Host ID                               Rack
UN  172.31.14.43  178.99 MB  1       ?     8aa86c2d-3cf3-4a19-bfe6-9b834b347325  rack1
UN  172.31.2.3    39.08 MB   1       ?     3939212c-0588-4eee-8262-774f5c52c517  rack1
{code}

{code}
cqlsh> desc KEYSPACE test

CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'Solr': '1'} AND durable_writes = true;

CREATE TABLE test.albums (
    album_uid int PRIMARY KEY,
    album_name text,
    artists int,
    comment text,
    composer int,
    cooperative int,
    disc_num int,
    duration int,
    label_name text,
    track_name text,
    track_num int,
    track_uid int,
    year int
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
{code}

{code}
~$ dse spark --conf spark.cassandra.output.consistency.level=ONE spark.cassandra.input.consistency.level=ONE
{code}

    hc errors out:

{code}
scala> hc.sql("select * from test.albums").count()
WARN 2016-01-19 21:26:30 org.apache.hadoop.hive.cassandra.CassandraManager: Default CL is LOCAL_ONE. Because of replication factor of local data center is less than 1, set CL to ONE
WARN 2016-01-19 21:26:30 org.apache.spark.scheduler.DAGScheduler: Creating new stage failed due to exception - job: 0
java.io.IOException: Could not get input splits
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:203) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:331) ~[hive-0.13.1-cassandra-connector-0.2.9.jar:0.2.9]
    at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:269) ~[hive-0.13.1-cassandra-connector-0.2.9.jar:0.2.9]
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:78) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:206) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:204) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
    at org.apache.spark.rdd.RDD.dependencies(RDD.scala:204) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:321) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:333) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGScheduler.getParentStagesAndId(DAGScheduler.scala:234) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGScheduler.newResultStage(DAGScheduler.scala:270) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:768) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1429) [spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1421) [spark-core_2.10-1.4.2.2.jar:1.4.2.2]
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) [spark-core_2.10-1.4.2.2.jar:1.4.2.2]
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints
    at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[na:1.8.0_40]
    at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[na:1.8.0_40]
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:199) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    ... 41 common frames omitted
Caused by: java.io.IOException: failed connecting to all endpoints
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:317) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221) ~[cassandra-all-2.1.11.969.jar:2.1.11-SNAPSHOT]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_40]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_40]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_40]
java.io.IOException: Could not get input splits
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:203)
    at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:331)
    at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:269)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82)
    at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:78)
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:206)
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:204)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.dependencies(RDD.scala:204)
    at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:321)
    at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:333)
    at org.apache.spark.scheduler.DAGScheduler.getParentStagesAndId(DAGScheduler.scala:234)
    at org.apache.spark.scheduler.DAGScheduler.newResultStage(DAGScheduler.scala:270)
    at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:768)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1429)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1421)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:199)
    ... 41 more
Caused by: java.io.IOException: failed connecting to all endpoints
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:317)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}

    sc hangs:

{code}
scala> sc.cassandraTable("test", "albums").count()
[Stage 0:> (0 + 1) / 3]
{code}

    Though it shows similar errors to the ones in HC on stderr.

    DataStax JIRA | 1 year ago | Sebastian Estevez
    java.io.IOException: Could not get input splits
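    (A hedged configuration sketch for this two-datacenter scenario follows the results list below.)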
  3. 0

    Titan 0.5.4: How to index/reindex data uploaded through Titan-Hadoop

    Stack Overflow | 1 year ago | Ruslan Mavlyutov
    java.io.IOException: Could not get input splits
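
    The DataStax JIRA report in item 2 describes a keyspace replicated only to the Solr datacenter ('class': 'NetworkTopologyStrategy', 'Solr': '1') while the query is driven from the SearchAnalytics datacenter. The most likely reading of the trace is that split computation cannot find any reachable replica for the token ranges, hence "failed connecting to all endpoints". A commonly suggested remedy for this symptom is to extend the keyspace's replication to the datacenter where Spark runs and then stream existing data to the new replicas. The sketch below is a hedged illustration of that idea from the dse spark shell, using the Spark Cassandra Connector (the same connector sc.cassandraTable relies on); the datacenter name SearchAnalytics comes from the nodetool status output above, and the replication factor of 1 is only an example, so verify both against your own cluster before running anything.

{code}
// Hedged sketch (not taken from the linked reports): make the `test` keyspace visible
// to the datacenter where Spark runs so split computation can find local replicas.
// Assumes the `dse spark` shell, where `sc` and the Spark Cassandra Connector exist;
// 'SearchAnalytics' is the DC name from `nodetool status`, and RF 1 is illustrative.
import com.datastax.spark.connector.cql.CassandraConnector

CassandraConnector(sc.getConf).withSessionDo { session =>
  session.execute(
    "ALTER KEYSPACE test WITH replication = " +
      "{'class': 'NetworkTopologyStrategy', 'Solr': '1', 'SearchAnalytics': '1'}")
}
{code}

    After the ALTER, existing rows still need to be streamed to the new replicas (typically with nodetool rebuild on the SearchAnalytics node, run from a regular shell); only then would you expect the hc.sql(...) and sc.cassandraTable(...) calls from the report to get their input splits.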

    Root Cause Analysis

    1. java.io.IOException

      failed connecting to all endpoints

      at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits()
    2. org.apache.cassandra
      AbstractColumnFamilyInputFormat$SplitCallable.call
      1. org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:317)
      2. org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)
      3. org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236)
      4. org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221)
      4 frames
    3. Java RT
      Thread.run
      1. java.util.concurrent.FutureTask.run(FutureTask.java:266)
      2. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      3. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      4. java.lang.Thread.run(Thread.java:745)
      4 frames
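
    The root cause chain above bottoms out in "failed connecting to all endpoints" inside AbstractColumnFamilyInputFormat.getSubSplits, meaning the input format could not open a connection to any replica that owns a given token range. A quick way to separate "Cassandra is unreachable from the driver" from "reachable, but no replicas in this datacenter" is to run a trivial query through the same connector the Spark shell uses. The snippet below is a minimal, hedged check; it assumes only the dse spark shell (sc plus the Spark Cassandra Connector on the classpath) and the always-present system.local table.

{code}
// Minimal connectivity check (a sketch, not a fix). If this trivial query succeeds but
// getSplits still fails with "failed connecting to all endpoints", the usual suspects
// are keyspace replication that excludes the local DC, or blocked Cassandra client
// ports between the Spark nodes and the replica nodes.
import com.datastax.spark.connector.cql.CassandraConnector

CassandraConnector(sc.getConf).withSessionDo { session =>
  val row = session.execute(
    "SELECT cluster_name, data_center, release_version FROM system.local").one()
  val cluster = row.getString("cluster_name")
  val dc      = row.getString("data_center")
  val version = row.getString("release_version")
  println(s"connected to cluster=$cluster dc=$dc cassandra=$version")
}

// Contact point(s) the connector was told to use, if any were set explicitly:
println(sc.getConf.get("spark.cassandra.connection.host", "<not explicitly set>"))
{code}

    If this check passes, compare the reported data_center with the datacenters listed in the keyspace's replication settings (see the sketch after the results list above) before digging further into the Hadoop layer.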