Reading in dynamically numbered column CSV

tdawg225 · May 28, 2012, 12:00am

I’ve been using Clover 3.1 for the last few months and have run into a problem that has stumped me for the last week. I can’t find anything in the documentation, forums, or examples that has helped resolve my issue, so I’m hoping someone can help.

I am trying to read in multiple CSV files that contain response times from some tests we have been running. Unfortunately when the CSVs are generated the number of columns varies from file to file depending on how many sub-tests there are within each test. I can’t figure out how to read these into Clover so that I can reformat and output to something we can use.

Example

File 1

Percentile, Test1a, Test1b, Test1c
0.1,20,34,15
0.2,21,34,19
…
99.9,20,35,18

File 2

Percentile, Test2a,Test2b,Test2c,Test2d
0.1,40,22,41
0.2,40,38,40
…
99.9,20,35,50

As you can see both files have “Percentile” as their first column, but all subsequent column names are different. All other cells contain the response times and this is what I’m interested in. Unfortunately because the number of columns is dynamic I haven’t been able to use the UniversalDataReader and define the edges, because the number of delimiters always changes.

I’m wondering if anyone else has had this issue before and how they handled it? There are a few possible solutions I see:
- Use a DataGenerator and write a Java class to read the CSV files and put the data in a record. I have discarded this solution because it looks to me like DataGenerator requires a static number of records to generate. The number of records to generate for me would be 1000 x number_of_columns, which is dynamic depending on the file.
- Use a MultiLevelReader. From what I have read, the MultiLevelReader may achieve what I’m looking for, but I can’t figure out which Java interface I’m supposed to be using, because I can’t use a lookup table.
- Create a custom component. I don’t have a lot of experience with creating custom components, but if this is the best option I’ll move forward with it.
- Use a JavaExecute component to run some Java code that will reformat all CSVs into a format that Clover can understand (i.e. each row would be in the format TestName, percentile, response time).
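
For reference, the reshaping described in the last option could be sketched in plain Java roughly as follows. This is only a minimal sketch under some assumptions: the class and method names (WideToLong, reshape) are made up for illustration, the files are plain comma-separated text with no quoted fields, and the wiring into a JavaExecute component is left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Clover): turns a "wide" CSV with one
// column per sub-test into "long" rows of TestName,Percentile,ResponseTime.
public class WideToLong {

    public static List<String> reshape(List<String> lines) {
        List<String> out = new ArrayList<String>();
        if (lines.isEmpty()) {
            return out;
        }
        // First line is the header: "Percentile" followed by the test names.
        String[] header = lines.get(0).split(",", -1);
        for (int r = 1; r < lines.size(); r++) {
            String line = lines.get(r);
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split(",", -1);
            String percentile = cells[0].trim();
            // Only pair up columns present in both the header and the row.
            int n = Math.min(header.length, cells.length);
            for (int c = 1; c < n; c++) {
                out.add(header[c].trim() + "," + percentile + "," + cells[c].trim());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "Percentile, Test1a, Test1b, Test1c",
                "0.1,20,34,15",
                "0.2,21,34,19");
        for (String row : reshape(sample)) {
            System.out.println(row);
        }
    }
}
```

The output always has exactly three columns, so a single fixed metadata definition would cover every file regardless of how many sub-tests it contains. Rows that have fewer cells than the header only produce entries for the cells that are actually present.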

If there are other options that I have missed please let me know!

vacekm · May 29, 2012, 2:51pm

Hello,

In your case you need to generate the metadata dynamically. To do this, create one graph that generates the metadata and then another that uses it and performs whatever processing you like. Please take a look at this thread, where I posted an example showing this: viewtopic.php?f=4&t=5484

For you the important part is the component XMLWriter in the graph EndecaGenerateMeta. You can see there how to generate the *.fmt file.

So to sum up:
- create a metadata-generating graph, in which you
  - extract all the test names from your CSV
  - strip all whitespace from the test names, since metadata field names can’t contain spaces and the like (you can see how to do this in the example as well)
  - use the XMLWriter to create the external metadata file
  - optionally add a RunGraph component in the last phase, which will call the worker graph
- create a graph that will use the generated metadata (the worker graph)
  - in here you can read your data with UniversalDataReader using the generated metadata.
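
To make the name-cleaning and metadata-file steps more concrete, here is a rough sketch in plain Java. It is not the XMLWriter graph from the linked example, just an illustration of the kind of output that graph needs to produce. The class and method names (MetadataSketch, fieldNames, toFmt) are hypothetical, every field is assumed to be a string, and the Record/Field layout with its attributes is an assumption about the *.fmt format that should be checked against a metadata file exported from your own Clover installation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not a Clover API.
public class MetadataSketch {

    // Turns a raw CSV header line into unique field names without whitespace.
    public static List<String> fieldNames(String headerLine) {
        Set<String> names = new LinkedHashSet<String>();
        for (String raw : headerLine.split(",", -1)) {
            // Replace anything that is not a letter, digit or underscore.
            String name = raw.trim().replaceAll("[^A-Za-z0-9_]", "_");
            if (name.isEmpty() || Character.isDigit(name.charAt(0))) {
                name = "f_" + name; // avoid empty names and leading digits
            }
            // Append a counter if two columns clean up to the same name.
            String unique = name;
            int i = 2;
            while (!names.add(unique)) {
                unique = name + "_" + i++;
            }
        }
        return new ArrayList<String>(names);
    }

    // Builds the metadata XML. The element and attribute names here are an
    // ASSUMPTION about the .fmt layout; verify against a real exported file.
    public static String toFmt(String recordName, List<String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<Record name=\"").append(recordName)
          .append("\" type=\"delimited\" fieldDelimiter=\",\" recordDelimiter=\"\\n\">\n");
        for (String field : fields) {
            sb.append("  <Field name=\"").append(field).append("\" type=\"string\"/>\n");
        }
        sb.append("</Record>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> fields = fieldNames("Percentile, Test2a,Test2b,Test2c,Test2d");
        System.out.print(toFmt("File2", fields));
    }
}
```

Whatever produces the file, the point is the same: one metadata file per CSV, with one field per header column, written before the worker graph runs.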

This can be done for each test file. I’m not sure whether you need to combine the data from multiple files. If so, you’d need different metadata for each file.
