public AggregateHashMap(StructType schema, int capacity, double loadFactor, int maxSteps) {
  // We currently only support a single key-value pair, where both key and value are longs
  assert (schema.size() == 2 && schema.fields()[0].dataType() == LongType &&
      schema.fields()[1].dataType() == LongType);

  // capacity should be a power of 2
  assert (capacity > 0 && ((capacity & (capacity - 1)) == 0));

  this.maxSteps = maxSteps;
  numBuckets = (int) (capacity / loadFactor);
  batch = ColumnarBatch.allocate(schema, MemoryMode.ON_HEAP, capacity);
  buckets = new int[numBuckets];
  Arrays.fill(buckets, -1);
}
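A minimal construction sketch for the map above, assuming Spark's StructType/DataTypes builder API. The two-long schema and the power-of-two capacity are the constraints the asserts enforce; the loadFactor and maxSteps values here are illustrative, not defaults from the source:

import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

// Two-long schema: the only shape the constructor's first assert accepts
StructType schema = new StructType()
    .add("key", DataTypes.LongType)
    .add("value", DataTypes.LongType);

// capacity must be a power of 2; loadFactor 0.5 and maxSteps 2 are illustrative choices
AggregateHashMap map = new AggregateHashMap(schema, 1 << 16, 0.5, 2);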
/**
 * Converts an iterator of rows into a single ColumnarBatch.
 */
public static ColumnarBatch toBatch(
    StructType schema, MemoryMode memMode, Iterator<Row> row) {
  ColumnarBatch batch = ColumnarBatch.allocate(schema, memMode);
  int n = 0;
  while (row.hasNext()) {
    Row r = row.next();
    for (int i = 0; i < schema.fields().length; i++) {
      appendValue(batch.column(i), schema.fields()[i].dataType(), r, i);
    }
    n++;
  }
  batch.setNumRows(n);
  return batch;
}
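A hedged usage sketch for toBatch, assuming the method is reachable on a utility class such as Spark's ColumnVectorUtils, with input rows built via RowFactory:

import java.util.Arrays;
import java.util.List;
import org.apache.spark.memory.MemoryMode;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

StructType schema = new StructType()
    .add("id", DataTypes.LongType)
    .add("name", DataTypes.StringType);

List<Row> rows = Arrays.asList(
    RowFactory.create(1L, "a"),
    RowFactory.create(2L, "b"));

// All rows are appended column-by-column into one batch
ColumnarBatch batch = ColumnVectorUtils.toBatch(schema, MemoryMode.ON_HEAP, rows.iterator());
assert batch.numRows() == 2;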
columnarBatch = ColumnarBatch.allocate(batchSchema, memMode);
if (partitionColumns != null) {
  int partitionIdx = sparkSchema.fields().length;
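  // Hedged continuation, sketching how the truncated fragment above proceeds in
  // Spark 2.x's VectorizedParquetRecordReader: partition columns sit after the
  // file columns and are filled once with constant values. The helper
  // ColumnVectorUtils.populate and setIsConstant() are assumed from that source.
  for (int i = 0; i < partitionColumns.fields().length; i++) {
    ColumnVectorUtils.populate(columnarBatch.column(i + partitionIdx), partitionValues, i);
    columnarBatch.column(i + partitionIdx).setIsConstant();
  }
}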
/**
 * Adapter class which handles columnar vector reading of CarbonData on top of
 * the Spark ColumnVector and ColumnarBatch API. This proxy class hides the
 * complexity of the Spark 2.3 API changes, since the ColumnVector and
 * ColumnarBatch interfaces are still evolving.
 *
 * @param memMode      whether to allocate the vectors on-heap or off-heap.
 * @param outputSchema schema of the table being read.
 * @param rowNum       number of rows to allocate for vector reading.
 * @param useLazyLoad  whether to use lazy loading when getting the data.
 */
public CarbonVectorProxy(MemoryMode memMode, StructType outputSchema, int rowNum,
    boolean useLazyLoad) {
  columnarBatch = ColumnarBatch.allocate(outputSchema, memMode, rowNum);
  columnVectorProxies = new ColumnVectorProxy[columnarBatch.numCols()];
  for (int i = 0; i < columnVectorProxies.length; i++) {
    if (useLazyLoad) {
      columnVectorProxies[i] =
          new ColumnVectorProxyWithLazyLoad(columnarBatch.column(i), rowNum, memMode);
    } else {
      columnVectorProxies[i] = new ColumnVectorProxy(columnarBatch.column(i), rowNum, memMode);
    }
  }
  updateColumnVectors();
}
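A minimal construction sketch for the proxy above; the schema and row count are illustrative, and CarbonVectorProxy itself comes from CarbonData's Spark integration as shown in the constructor:

import org.apache.spark.memory.MemoryMode;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

StructType outputSchema = new StructType()
    .add("id", DataTypes.LongType)
    .add("name", DataTypes.StringType);

// Allocate an on-heap batch of 4096 rows with lazy loading enabled
CarbonVectorProxy proxy = new CarbonVectorProxy(MemoryMode.ON_HEAP, outputSchema, 4096, true);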