Pull Underlined Text With Spreadsheet Reader

Support/help with CloverETL (4.9) and CloverDX (5.0 or newer) implementation problems

hesske
Posts: 15
Joined: Wed Jan 03, 2018 7:31 pm

Pull Underlined Text With Spreadsheet Reader

Postby hesske » Sat Jan 26, 2019 3:53 pm

Is it possible to pull the bold or underlined formatting with the text in Spreadsheet reader? I have an excel file that has a long string in each cell (very poorly formatted excel file I was given). Some strings are underlined. Those are the cells I need to pull. Is this possible?
Thanks

andras.csore
Posts: 18
Joined: Wed Oct 24, 2018 2:38 pm

Re: Pull Underlined Text With Spreadsheet Reader

Postby andras.csore » Thu Jan 31, 2019 4:10 pm

My add-on note not exactly how to do it (as i guess, by default its may not works), but a little trick about Excel vs. Underline... Because whatever method you try to decide the underline, may you need take care of one little bit...

1st: underline is not Boolean, but some 'style' value. Like Underlined = 2, double underlined = 5, etc... The not underlined is -4142...
2nd: and IF just part of the text is underlined (example, only the 2nd word), so its 'mixed', the result is null
So, may wort to check to booth cases...

I do some similar case in the past with the following way: i create a small VBA Excel macro, its check the cell is has some formatting, and if it is, its copy the cell value to new column (so, practically its 'cleanup' the data).
Example VBA (Excel) script:

Code: Select all

Sub underline_checker()
' underline_checker Macro andras.csore
' in Excel the Font.Underline value is not Boolean, but its the style of the underline (single, double, etc...)
' The not underline return value is : -4142
    Dim underline_type As Integer
    For ciklus = 1 To 3
        If IsNull(Cells(2, ciklus).Font.Underline) = True Then ' Null return if only a part of the cell value is underlined...
            underline_type = 0
        Else
            underline_type = Cells(2, ciklus).Font.Underline
        End If
       
        If underline_type <> -4142 Then  ' If not equal with no underline, we write-out (print) to another column...
            Cells(2, ciklus + 4).Value = Cells(2, ciklus).Value
        End If
    Next
End Sub

Screenshot:
2019-01-31 15_53_53-underline_vba.xlsm - Excel.png
2019-01-31 15_53_53-underline_vba.xlsm - Excel.png (3.82 KiB) Viewed 423 times

The small code check the row=2, col 1..3, put cell value to new columns (5..7) if its underlined...

However, i think may the best solution to implement some similar code with java library code for that (like jExcel, etc...)?

p.s.: may if the formatting is not 'hardcoded', but Excel conditional formatting, some 3rd party library may not parse this conditional formatting well. But if its conditional, may you also able to build the logic for that...

bartonv
Posts: 117
Joined: Wed May 03, 2017 12:10 pm

Re: Pull Underlined Text With Spreadsheet Reader

Postby bartonv » Wed Feb 06, 2019 11:46 am

Thank you, Andras, for the reply. We appreciate your input!
Best regards,
---
Vladimir Barton
CloverCARE Support
CloverDX

Visit us online at http://www.cloverdx.com

bartonv
Posts: 117
Joined: Wed May 03, 2017 12:10 pm

Re: Pull Underlined Text With Spreadsheet Reader

Postby bartonv » Wed Feb 06, 2019 11:53 am

Hi Hesske,
I am afraid there is no native function that would be able to extract this type of Excel string format within CloverDX. However, there might be a different approach that we could take to parse this Excel file of yours. Would you mind providing us with a snippet of it? If it contains sensitive information, kindly manufacture a sample that would mimic the original file content pattern.
Regards,
---
Vladimir Barton
CloverCARE Support
CloverDX

Visit us online at http://www.cloverdx.com

hesske
Posts: 15
Joined: Wed Jan 03, 2018 7:31 pm

Re: Pull Underlined Text With Spreadsheet Reader

Postby hesske » Thu Feb 07, 2019 4:59 pm

Thank you Andras and Vladimir. I have zipped and attached
Availability_Sheet_All_Days_test.zip
(8.04 KiB) Downloaded 22 times
a sample of the spreadsheet I am working with. I have scrubbed the names, but left the formatting in tact. I am wanting to pull specifically the records that are underlined. I have gotten a manual way to remove all non-underlined records by using Replace and formatting. But if there is a way I could do everything in CloverDX, I would much prefer that. Thanks again!

bartonv
Posts: 117
Joined: Wed May 03, 2017 12:10 pm

Re: Pull Underlined Text With Spreadsheet Reader

Postby bartonv » Fri Feb 08, 2019 11:46 am

Hi Hesske,
unfortunately, there is nothing else in CloverDX we can offer to handle this specific use case. Nevertheless, I have done some further research and found this interesting article about how to identify bold cells in Excel directly. I tried the third option (using VBA) which is pretty straight-forward and works smoothly but mainly, it enables you to assign a boolean value to a neighboring field saying whether the field is bold or not. This can be further used in CloverDX as a filter to filter out the desired records. I hope this helps.
Kind regards,
Vladi
---
Vladimir Barton
CloverCARE Support
CloverDX

Visit us online at http://www.cloverdx.com

hesske
Posts: 15
Joined: Wed Jan 03, 2018 7:31 pm

Re: Pull Underlined Text With Spreadsheet Reader

Postby hesske » Sat Feb 09, 2019 12:56 am

No worries, I knew this was going to be a very odd case when I posted it. Thanks for all the replies :)