As described in How to get your data into CT, CloudTurbine source data can be created in several ways:

  • use an existing CT source app such as CTstream or CTtext
  • manually create the needed folder/file structure and content to define a CT source
  • make a shell script which outputs data in the CT folder/file structure
  • use a programming language to create a custom CT source application

We’ll focus on the fourth method in this article: creating a CT source application using the CTwriter class from the CloudTurbine Java API.  Before we dive into a simple CT source example, here are a few resources and tips for developing CT source apps:

  • It can be helpful to reference existing source code examples.  What better way than to peruse the source apps included in the official CloudTurbine distribution, available in a GitHub repository.  Good examples to start with are CTsource.java and CTblocktest.java.
  • Details of the Java API are provided in the Javadoc, available either online or in the JavaDoc/CTlib directory if you are building CloudTurbine from source.
  • You need a Java Development Kit installed on your machine to compile your CT source app.  We recommend using Java SE version 8 or newer.
  • The CloudTurbine library file, CTlib.jar, is available as part of the CloudTurbine releases or can be built from source.  CTlib.jar will need to be included on the javac classpath to build a CT app and on the java classpath to run a CT app.

In How to get your data into CT we showed how you can manually create the simple CT source shown in the folder/file structure below.  The first “output.txt” file contains the string “Hello” and the second “output.txt” file contains the string “World”.  This is a legitimate CloudTurbine source whose data could be read using the CTreader API class or viewed using CTweb/WebScan.
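As a rough sketch of what the manual approach amounts to, the following program builds a comparable layout using only standard java.nio calls, with no CloudTurbine classes involved.  It assumes each data point lives in a folder named for its timestamp (milliseconds since epoch) with the channel file inside; the source folder name "CTexample", the helper name writePoint, and the two timestamps are hypothetical values chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ManualCTSource {

    /** Writes one string to sourceDir/timeMillis/channel and returns the file path. */
    static Path writePoint(Path sourceDir, long timeMillis, String channel, String data)
            throws IOException {
        // Assumption: the time folder is named with milliseconds since epoch.
        Path timeDir = sourceDir.resolve(Long.toString(timeMillis));
        Files.createDirectories(timeDir);
        Path file = timeDir.resolve(channel);
        Files.write(file, data.getBytes(StandardCharsets.UTF_8));
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path source = Paths.get("CTexample");   // hypothetical source folder name
        // Arbitrary example timestamps, one second apart.
        writePoint(source, 1500000000000L, "output.txt", "Hello");
        writePoint(source, 1500000001000L, "output.txt", "World");
    }
}
```

Running it leaves two time folders under CTexample, each holding its own "output.txt" file.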

Let’s create a simple CT Java program which produces the exact same output source structure and content.  Start off by importing the CloudTurbine classes found in “CTlib.jar”.

Create a CTwriter object, giving it the desired name of the output source folder.

Use the setTime and putData methods to add data points to the source.  The times specified in the setTime calls are milliseconds since epoch (January 1, 1970), which is a common time-base for CloudTurbine sources.  If you call putData without a corresponding call to setTime, current wall-clock time in milliseconds since epoch will be used.  You can also pass setTime a floating point number, which will be interpreted as the number of seconds since epoch (and internally converted to milliseconds since epoch).  The calls to putData specify the channel name (“output.txt” in this case) and the string data.
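The time handling described above can be illustrated with a small standalone sketch.  It is not CTwriter's internal code: the class TimeBase and its methods are hypothetical, and rounding to the nearest millisecond is an assumption (the library may truncate instead).

```java
class TimeBase {

    /** Converts floating-point seconds since epoch to whole milliseconds since epoch. */
    static long secondsToMillis(double seconds) {
        // Assumption: round to the nearest millisecond.
        return Math.round(seconds * 1000.0);
    }

    /** Returns the explicit time if one was set, otherwise current wall-clock time. */
    static long resolveTime(Long explicitMillis) {
        return (explicitMillis != null) ? explicitMillis : System.currentTimeMillis();
    }

    public static void main(String[] args) {
        System.out.println(secondsToMillis(1500000000.25));   // prints 1500000000250
        System.out.println(resolveTime(1500000000000L));      // prints 1500000000000
        System.out.println(resolveTime(null));                // current time in milliseconds
    }
}
```

Note that a Java literal such as 1500000000000 needs the L suffix to be treated as a long; without it the value does not fit in an int and the code will not compile.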

Finish by flushing data and closing the source.  The call to flush is optional (since close itself includes a call to flush) but we include it here for completeness.
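To see why the explicit flush is optional, consider a toy writer whose close method flushes on its own.  This is only an illustration of the pattern: ToyWriter is a hypothetical in-memory class, not part of CTlib, and it writes nothing to disk.

```java
import java.util.ArrayList;
import java.util.List;

class ToyWriter implements AutoCloseable {
    private final List<String> pending = new ArrayList<>();   // buffered, not yet written
    private final List<String> written = new ArrayList<>();   // stands in for files on disk

    void put(String data) {
        pending.add(data);
    }

    /** Moves everything buffered so far to the written list. */
    void flush() {
        written.addAll(pending);
        pending.clear();
    }

    /** Flushes any remaining data, so calling flush() beforehand is optional. */
    @Override
    public void close() {
        flush();
    }

    List<String> written() {
        return written;
    }

    public static void main(String[] args) {
        ToyWriter w = new ToyWriter();
        w.put("Hello");
        w.put("World");
        w.close();                          // no explicit flush() needed
        System.out.println(w.written());    // prints [Hello, World]
    }
}
```

Calling flush and then close is harmless here, since the second flush finds nothing left to write.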

That’s it!  The complete program is shown below.

Beyond this simple example, the CTwriter class includes built-in support for a number of advanced options, including:

  • creating ZIP’ed output files
  • writing data to an FTP or HTTP/S server
  • adding a “segment” folder layer, which can be useful for optimizing the organization and retrieval of your data

The CloudTurbine Java API also includes a CTreader class for fetching data from CloudTurbine sources (i.e., creating “sink” applications).  Read more about this in the Writing CT sink apps article.

Alternate language bindings

While the CloudTurbine Java API is the officially supported programming interface for developing CT apps, APIs in other languages are possible.  A full-featured API should support the CloudTurbine file structure defined in the Structure document.  A source/writer interface is easier to develop than a sink/reader interface, because a reader must support a variety of folder/file structure options.  An implementation of the source/writer interface is available as a C# library for Microsoft Windows platforms.