PCollection.aggregate

How to use aggregate method in org.apache.crunch.PCollection

Best Java code snippets using org.apache.crunch.PCollection.aggregate (Showing top 1 results out of 315)

origin: apache/crunch

public int run(String[] args) throws Exception {
 if (args.length != 1) {
  System.err.println();
  System.err.println("Usage: " + this.getClass().getName() + " [generic options] input");
  System.err.println();
  GenericOptionsParser.printGenericCommandUsage(System.err);
  return 1;
 }
 // Create an object to coordinate pipeline creation and execution.
 Pipeline pipeline = new MRPipeline(TotalWordCount.class, getConf());
 // Reference a given text file as a collection of Strings.
 PCollection<String> lines = pipeline.readTextFile(args[0]);
 // Map each line to the number of whitespace-separated tokens it contains,
 // producing a PCollection of per-line word counts.
 PCollection<Long> numberOfWords = lines.parallelDo(new DoFn<String, Long>() {
  public void process(String line, Emitter<Long> emitter) {
   emitter.emit((long)line.split("\\s+").length);
  }
 }, Writables.longs()); // Indicates the serialization format
 // aggregate() combines the per-line counts into a PCollection holding their
 // sum; first() exposes that single value as a PObject.
 PObject<Long> totalCount = numberOfWords.aggregate(Aggregators.SUM_LONGS()).first();
 // Execute the pipeline as a MapReduce job.
 PipelineResult result = pipeline.run();
 System.out.println("Total number of words: " + totalCount.getValue());
 
 pipeline.done();
 return result.succeeded() ? 0 : 1;
}

Javadoc

Returns a PCollection that contains the result of aggregating all values in this instance.
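
The snippet above maps every line to a count and then aggregates the counts into one total. The following plain-Java sketch mimics that map-then-sum shape locally so the arithmetic can be checked without a cluster. It does not use the Crunch API: WordTotalSketch, naiveCount, trimmedCount and total are hypothetical names introduced only for illustration. It also shows that the split expression used in the snippet counts an empty line as one token and counts a leading empty token on lines that start with whitespace, which may or may not matter for a given input.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical local stand-in for the Crunch pipeline; not part of Crunch.
final class WordTotalSketch {

    // Per-line count exactly as written in the snippet: split on whitespace runs.
    static long naiveCount(String line) {
        return line.split("\\s+").length;
    }

    // Variant that trims first and treats a blank line as zero words.
    static long trimmedCount(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? 0L : trimmed.split("\\s+").length;
    }

    // Local analogue of parallelDo followed by a sum aggregation:
    // map each line to a long, then add the results together.
    static long total(List<String> lines, ToLongFunction<String> counter) {
        return lines.stream().mapToLong(counter).sum();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the quick brown fox", "", "  jumps over  ");
        System.out.println("naive total:   " + total(lines, WordTotalSketch::naiveCount));   // 8
        System.out.println("trimmed total: " + total(lines, WordTotalSketch::trimmedCount)); // 6
    }
}
```

If the stricter count is wanted in the real pipeline, the same trim-and-check logic could be placed inside the DoFn's process method; that change is a suggestion, not something the original example does.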

Popular methods of PCollection

  • parallelDo
    Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the
  • getPType
    Returns the PType of this PCollection.
  • by
    Apply the given map function to each element of this instance in order to create a PTable.
  • write
    Write the contents of this PCollection to the given Target, using the given Target.WriteMode to hand
  • materialize
    Returns a reference to the data set represented by this PCollection that may be used by the client t
  • getPipeline
    Returns the Pipeline associated with this PCollection.
  • getTypeFamily
    Returns the PTypeFamily of this PCollection.
  • count
    Returns a PTable instance that contains the counts of each unique element of this PCollection.
  • asReadable
  • cache
    Marks this data as cached using the given CachingOptions. Cached PCollections will only be processed
  • filter
    Apply the given filter function to this instance and return the resulting PCollection.
  • first
  • getName
  • getSize
  • union
