Is there a component option that will allow the FlatFileReader to locate a file in a directory with the most recent create (or update) date? For example the directory I am reading from contains 4 files and new files will be added as dictated by the business. I only care about reading the latest dated file that starts with the name test_. I know a * is a wildcard but I haven’t seen any documentation on how to find the file name with the newest create date. Do I need to use the ListFiles component to first read the directory and then parse each file’s metadata to find my target file name?
Example directory: (I need to read file test_3.csv based off the criteria that its name starts with “test_” and it has the latest create date.)
test_1.csv file create date 2022-12-05
test_2.csv file create date 2022-12-13
test_3.csv file create date 2022-12-17
prod_A.csv file create date 2022-12-20
Hi Tim,
There are multiple ways you can approach this. One is to sort the files by their last modified date and then pick the newest record (file). I recommend building this logic as a subgraph for reusability; for example, you can add a parameter to specify which folder to check for new files.
ListFiles - point the path to the desired folder location.
Metadata - when you connect an edge from the output port of the ListFiles to (any) component, a metadata record containing the available file information will be assigned automatically. We can take advantage of this information. In your case, we are interested in the fields lastModified and name (or URL). Whether you remove the remaining fields is up to you.
ExtSort - sort the data based on the lastModified field descending, so the newest file is first.
Map - we want to use the Map component to “filter out” the unwanted records and keep only the first one. We can do this by creating a variable that prevents all later records from being sent to the output port (see the CTL code below).
//#CTL2
// initialize the variable with the value true as we want to process the first incoming record
boolean processRecord = true;

// Transforms input record into output record.
function integer transform() {
    if (processRecord) { // processing the first record
        $out.0.* = $in.0.*; // mapping data
        processRecord = false;
        return ALL;
    } else { // first record already processed
        return SKIP; // discard
    }
}
This will leave us with a single record containing the information regarding the newest file.
Additionally, you can use the File Event Listener if you want to process the file as soon as it “arrives” in a folder. Go to your CloverDX Server web console > Event Listeners > New Listener > File event Listener > Set up the Listener (Type of check = File added).
Regards,
Ladislav.
Hello Ladislav,
Thank you for the assistance. The Map function you provided worked to isolate the first record from the ListFiles set. Going back to the example data I provided, there are files in the directory with a different name and a newer date that I do not want to be included (prod_A.csv). To remove those files from the data set, I added a Filter component before the ExtSort to ensure that only files whose names start with “test_” are included in the end result set. With the URL value for file test_3.csv isolated, I was able to pass it into the FlatFileReader.
//#CTL2
$in.0.name ?= "test_*";
Hi Tim,
Glad I could help! Yes, the regexp contains operator (?=) is the way to check for a regex when filtering. In addition to that, you can also use the startsWith(string input, string prefix) or endsWith(string input, string suffix) functions if you are looking for a certain prefix/suffix in the checked string.
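One thing to keep in mind: as a regular expression, test_* means "test" followed by zero or more underscores, and a contains check can match anywhere in the name, so a file such as latest.csv would also pass. A prefix check avoids that. Below is a rough, untested sketch of the startsWith/endsWith variant written as a Map transformation placed before the ExtSort. The prefix and suffix values are assumptions taken from the example file names in this thread, and it assumes the name field holds only the file name without the path.

```ctl
//#CTL2
// Assumed values, based on the example files test_1.csv ... test_3.csv
string prefix = "test_";
string suffix = ".csv";

// Passes a record through only when the file name has the expected prefix and suffix.
function integer transform() {
    if (startsWith($in.0.name, prefix) && endsWith($in.0.name, suffix)) {
        $out.0.* = $in.0.*; // keep the file information unchanged
        return ALL;
    }
    return SKIP; // e.g. prod_A.csv is discarded here
}
```

The same condition should also be usable as the expression in your existing Filter component, which keeps the graph as you already built it.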
Ladislav.