PCollection.filter

How to use the filter method in org.apache.crunch.PCollection

Best Java code snippets using org.apache.crunch.PCollection.filter (Showing top 4 results out of 315)

origin: org.apache.crunch/crunch-hbase

/**
 * Writes out HFiles from the provided <code>cells</code> and <code>table</code>. <code>limitToAffectedRegions</code>
 * is used to indicate that the regions the <code>cells</code> will be loaded into should be identified prior to writing
 * HFiles. Identifying the regions ahead of time will reduce the number of reducers needed when writing. This is
 * beneficial if the data to be loaded only touches a small enough subset of the total regions in the table. If set to
 * false, the number of reducers will equal the number of regions in the table.
 *
 * @see <a href='https://issues.apache.org/jira/browse/CRUNCH-588'>CRUNCH-588</a>
 */
public static <C extends Cell> void writeToHFilesForIncrementalLoad(
  PCollection<C> cells,
  HTable table,
  Path outputPath,
  boolean limitToAffectedRegions) throws IOException {
 HColumnDescriptor[] families = table.getTableDescriptor().getColumnFamilies();
 if (families.length == 0) {
  LOG.warn("{} has no column families", table);
  return;
 }
 PCollection<C> partitioned = sortAndPartition(cells, table, limitToAffectedRegions);
 for (HColumnDescriptor f : families) {
  byte[] family = f.getName();
  partitioned
    .filter(new FilterByFamilyFn<C>(family))
    .write(new HFileTarget(new Path(outputPath, Bytes.toString(family)), f));
 }
}
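
The loop above applies one filter per column family to the same partitioned collection and writes each result under a path named after that family. A minimal sketch of that fan-out pattern in plain Java follows. It does not use Crunch or HBase: `FakeCell`, `splitByFamily`, and the string paths are hypothetical stand-ins chosen for illustration, and the real method writes HFiles rather than building in-memory lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FamilyFanOutSketch {

  // Hypothetical stand-in for an HBase cell: only a family name and a value.
  static final class FakeCell {
    final String family;
    final String value;

    FakeCell(String family, String value) {
      this.family = family;
      this.value = value;
    }
  }

  // For each family, keep only the matching cells and file them under
  // "<outputPath>/<family>". Cells whose family is not listed are dropped.
  static Map<String, List<String>> splitByFamily(
      List<FakeCell> cells, List<String> families, String outputPath) {
    Map<String, List<String>> outputs = new LinkedHashMap<String, List<String>>();
    for (String family : families) {
      List<String> kept = new ArrayList<String>();
      for (FakeCell cell : cells) {
        if (cell.family.equals(family)) {
          kept.add(cell.value);
        }
      }
      outputs.put(outputPath + "/" + family, kept);
    }
    return outputs;
  }

  public static void main(String[] args) {
    List<FakeCell> cells = Arrays.asList(
        new FakeCell("info", "a"),
        new FakeCell("stats", "b"),
        new FakeCell("info", "c"));
    System.out.println(splitByFamily(cells, Arrays.asList("info", "stats"), "/out"));
  }
}
```

Each pass reads the whole input again, so the cost grows with the number of families. That matches the shape of the original loop, which issues one `filter` and one `write` per family.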
origin: apache/crunch

cells = cells.filter(new StartRowFilterFn<C>(scan.getStartRow()));
cells = cells.filter(new StopRowFilterFn<C>(scan.getStopRow()));
cells = cells.filter(new FamilyMapFilterFn<C>(scan.getFamilyMap()));
cells = cells.filter(new TimeRangeFilterFn<C>(timeRange));
origin: apache/crunch

hfileTarget.outputConf(RegionLocationTable.REGION_LOCATION_TABLE_PATH, regionLocationFilePath.toString());
partitioned
  .filter(new FilterByFamilyFn<C>(family))
  .write(hfileTarget);

Javadoc

Apply the given filter function to this instance and return the resulting PCollection.
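
To illustrate the semantics described here, the following is a minimal sketch in plain Java with no Crunch dependency. `KeepFn` and `MiniCollection` are hypothetical stand-ins, not Crunch classes. `KeepFn.accept` mirrors the `accept(T)` method that Crunch's `FilterFn` asks you to implement, and the list-backed collection only approximates a PCollection, which in Crunch is evaluated lazily by a pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for a filter function: an element is kept when accept() returns true.
abstract class KeepFn<T> {
  abstract boolean accept(T input);
}

// Hypothetical list-backed collection; NOT Crunch's PCollection.
final class MiniCollection<T> {
  private final List<T> items;

  MiniCollection(List<T> items) {
    this.items = new ArrayList<T>(items);
  }

  // Returns a new collection; the receiver is left unchanged.
  MiniCollection<T> filter(KeepFn<T> fn) {
    List<T> kept = new ArrayList<T>();
    for (T item : items) {
      if (fn.accept(item)) {
        kept.add(item);
      }
    }
    return new MiniCollection<T>(kept);
  }

  List<T> toList() {
    return new ArrayList<T>(items);
  }
}

final class FilterSketch {

  // Chained filters compose: an element must pass both to survive.
  static List<Integer> evensAbove(List<Integer> input, final int floor) {
    MiniCollection<Integer> values = new MiniCollection<Integer>(input);
    return values
        .filter(new KeepFn<Integer>() {
          @Override
          boolean accept(Integer n) {
            return n % 2 == 0;
          }
        })
        .filter(new KeepFn<Integer>() {
          @Override
          boolean accept(Integer n) {
            return n > floor;
          }
        })
        .toList();
  }

  public static void main(String[] args) {
    // prints [4, 6]
    System.out.println(evensAbove(Arrays.asList(1, 2, 3, 4, 5, 6), 2));
  }
}
```

Because `filter` returns a new collection each time, reassigning the result as in `cells = cells.filter(...)` narrows the data step by step, and the order of the filters does not change which elements survive.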

Popular methods of PCollection

  • parallelDo
    Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the
  • getPType
    Returns the PType of this PCollection.
  • by
    Apply the given map function to each element of this instance in order to create a PTable.
  • write
    Write the contents of this PCollection to the given Target, using the given Target.WriteMode to hand
  • materialize
    Returns a reference to the data set represented by this PCollection that may be used by the client t
  • getPipeline
    Returns the Pipeline associated with this PCollection.
  • getTypeFamily
    Returns the PTypeFamily of this PCollection.
  • count
    Returns a PTable instance that contains the counts of each unique element of this PCollection.
  • aggregate
    Returns a PCollection that contains the result of aggregating all values in this instance.
  • asReadable
  • cache
    Marks this data as cached using the given CachingOptions. Cached PCollections will only be processed
  • first
  • getName
  • getSize
  • union
