screen-scraper CsvWriter usage

  • A simple class for writing CSV files in screen-scraper. (Requires version 4.5.18a or later.)
  • Javadocs: com.screenscraper.csv.CsvWriter

    General use-case

    Setup Writer

  • Usually you just want to set up a CsvWriter, write some DataRecords, and close the writer. This is the most basic way to do that.
  • First, a script that runs before the scraping session creates a writer and stores it in a session variable.
  • //scraping session init script
    import com.screenscraper.csv.CsvWriter;

    String[] header = {"Brand Name", "Product Title"};
    CsvWriter writer = new CsvWriter("output.csv");
    writer.setHeader(header);
     
    // Store the writer so later scripts can retrieve it
    session.setVariable("WRITER", writer);
  • To create the writer, you simply pass the path of the file you want to output. (see javadocs for more constructor options)
  • You can then set the header row you want to use.
  • The header row will only get written if the file doesn't already exist.
  • The header row is also used to create a mapping between DataRecord keys and which column they represent.
  • By default, the mapping keys are the header values converted to upper case, with spaces replaced by underscores. (see javadocs for more mapping options)
  • So in this case, if you pass a DataRecord with the "BRAND_NAME" and "PRODUCT_TITLE" keys set, their values will map to the correct CSV output columns.
  • There are also other options for writing new records; see the javadocs.
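    The default header-to-key mapping described above can be illustrated in plain Java. This is only a sketch of the rule as stated (upper case, underscores for spaces), not the CsvWriter source. The class HeaderMappingSketch and its methods toKey and toRow are hypothetical names, and filling a missing key with an empty string is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: mimics the default mapping rule described in the text.
// It is NOT the CsvWriter implementation.
class HeaderMappingSketch {

    // "Brand Name" -> "BRAND_NAME": upper-case, spaces become underscores.
    static String toKey(String headerCell) {
        return headerCell.toUpperCase(Locale.ROOT).replace(' ', '_');
    }

    // Arrange a record's values in header order.
    // Assumption: a key with no value becomes an empty cell.
    static String[] toRow(String[] header, Map<String, String> record) {
        String[] row = new String[header.length];
        for (int i = 0; i < header.length; i++) {
            String value = record.get(toKey(header[i]));
            row[i] = (value == null) ? "" : value;
        }
        return row;
    }

    public static void main(String[] args) {
        String[] header = {"Brand Name", "Product Title"};
        Map<String, String> record = new LinkedHashMap<>();
        // Insertion order differs from header order on purpose.
        record.put("PRODUCT_TITLE", "Widget");
        record.put("BRAND_NAME", "Acme");
        System.out.println(String.join(",", toRow(header, record))); // prints Acme,Widget
    }
}
```

    The point of the sketch is that column order comes from the header, not from the order in which keys were set on the record.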
    Write DataRecord

  • With the writer set up in a session variable, you're ready to add records during the scraping session.
  • The easiest way is to call a script after each application of an extractor pattern that contains all the data you need.
  • Here is an example of such a script.
  • //Just matched an extractor pattern with tokens "BRAND_NAME" and "PRODUCT_TITLE"
    import com.screenscraper.csv.CsvWriter;

    // Retrieve the writer created by the init script
    CsvWriter writer = (CsvWriter) session.getVariable("WRITER");
    writer.write(dataRecord);
    writer.flush();
    session.addToNumRecordsScraped(1);
  • Since the DataRecord already has the necessary key values, all you have to do is call writer.write(dataRecord).
  • The writer.flush() call causes the record to be written immediately to the file. Otherwise, writes are buffered and written in chunks for better performance. (see javadocs for FileWriter)
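    The effect of buffering can be seen with the standard java.io classes alone. The sketch below assumes CsvWriter behaves like an ordinary buffered writer, as the FileWriter reference above suggests. FlushSketch and demo are hypothetical names used only for illustration.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch using plain java.io, not CsvWriter itself.
class FlushSketch {

    // Writes one short row and returns {bytes on disk before flush, bytes on disk after flush}.
    static long[] demo() {
        try {
            File file = File.createTempFile("flush-sketch", ".csv");
            file.deleteOnExit();
            try (BufferedWriter out = new BufferedWriter(new FileWriter(file))) {
                out.write("Acme,Widget");
                out.newLine();
                long before = file.length(); // the short row is still held in the buffer
                out.flush();                 // push the buffered characters to the file
                long after = file.length();
                return new long[] {before, after};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = demo();
        System.out.println("before flush: " + sizes[0] + " bytes, after flush: " + sizes[1] + " bytes");
    }
}
```

    Flushing after every record trades some throughput for the guarantee that a crash or a stopped session loses at most the record being written.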
    Close the writer

  • After the scraping session is finished, you should have a script run that closes the CsvWriter.
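  • //scraping session close script
    import com.screenscraper.csv.CsvWriter;

    // Retrieve the writer created by the init script and release the file
    CsvWriter writer = (CsvWriter) session.getVariable("WRITER");
    writer.close();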
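
    The whole lifecycle (open, write, close) can be sketched in plain Java to show why the header appears only once across repeated runs. This assumes CsvWriter appends to an existing file, which the header rule implies but this page does not state outright. LifecycleSketch, runSession, and demo are hypothetical names, and the rows are made-up sample data.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical sketch using plain java.nio, not CsvWriter itself.
class LifecycleSketch {

    // One "scraping session": open, write the header only for a new file, write a row, close.
    static void runSession(Path file, String header, String row) {
        boolean isNewFile = !Files.exists(file);
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            if (isNewFile) {
                out.write(header + "\n");
            }
            out.write(row + "\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } // try-with-resources closes the writer, which also flushes the buffer
    }

    // Runs two sessions against the same file and returns the resulting lines.
    static List<String> demo() {
        try {
            Path file = Files.createTempDirectory("csv-sketch").resolve("output.csv");
            runSession(file, "Brand Name,Product Title", "Acme,Widget");
            runSession(file, "Brand Name,Product Title", "Globex,Gadget");
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

    Running it leaves one header line followed by both rows. In a scraping session, the close script plays the role of the try-with-resources block here.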