cancel
Showing results for 
Search instead for 
Did you mean: 

trim leading zeros from text '0000000-4005-1' ?

RADSr
Partner
Partner

Here's another little puzzler - 

I'm trying to trim the leading zeros from this string '0000000-4005-1'  - I can't just replace zeros w/ nulls because '4005' still needs its zeros. ( also, the field won't always contain a dash so I can't use that in my logic ).

My plan was to replace the zeros with a space, trim the string, and the replace the space with a zero.

However, replace('000123401D' , 0, '  ') nets me '-4 5-1'      I tried putting a backslash and a space inside the single quotes ( '\ ' )  and Incorta barked at me.  Putting two backslashes *and a space* inside the single quotes nets me '\\\\\\\-4\\5-1 which gets me closer, but not quite there.

here are some test cases:

0000000-4005-1

00043317006556

012993102012DE

8549340076315.      << yes, that's a decimal point at the end -- it doesn't have leading zeros, but shows me that I basically can't count on any characters being prohibited by the data entry program.  

Ideas?  Maybe a way to force an ANSI char(52) in there v a space-bar space?    I've come across and remember Incorta having trouble with spaces in filter expressions/prompts but don't recall how we worked around it.  

4 REPLIES 4

RADSr
Partner
Partner

Patience is a virtue ...   in which I am lacking  😉  

I added this to the SQL in the Incorta table data source:

trim(CASE 
    WHEN LEFT(replace(ltrim(replace(zeroCode, '0'' ')), ' ''0' ), 1 ) = '-'
    THEN substring(replace(ltrim(replace(zeroCode, '0'' ')), ' ''0' ) , 2255)
    ELSE (replace(ltrim(replace(zeroCode, '0'' ')), ' ''0' )  )
    END )  AS zeroCode_CLEAN
 
Since the SQL function recognized and keeps the ' ' as a space it looks to be working.
 
If anyone figures out how to do this in an Incorta formula I'd love to know the answer.
 
I'd also still like to see some discussion/exposition on how Incorta handles spaces  ( e.g. if I filter on <field> does not contain  'TEST ' ( with a space after the T ) it gets rid of 'Test case' but it also gets rid of 'DETEST' ( with no space after the T )
 
 

I can reproduce your source SQL based approach using Oracle:

 

 

SELECT
    trim(
        CASE
            WHEN substr(
                replace(ltrim(replace(zeroCode, '0', ' ')), ' ', '0'),
                1
            ) = '-' THEN substr(
                replace(ltrim(replace(zeroCode, '0', ' ')), ' ', '0'),
                2,
                255
            )
            ELSE (
                replace(ltrim(replace(zeroCode, '0', ' ')), ' ', '0')
            )
        END
    ) AS zeroCode_CLEAN
FROM
    (
        SELECT
            '0000000-4005-1' zeroCode
        FROM
            DUAL
        UNION ALL
        SELECT
            '00043317006556' zeroCode
        FROM
            DUAL
        UNION ALL
        SELECT
            '012993102012DE' zeroCode
        FROM
            DUAL
        UNION ALL
        SELECT
            '8549340076315.' zeroCode
        FROM
            DUAL
    )

 

 

The logic can be implemented using Incorta formula, exactly using the logic you described:


@RADSr wrote:

replace the zeros with a space, trim the string, and the replace the space with a zero.

 

 

replace(
	lTrim(
		replace(
			MVTest.trim_with_spark.col, 
			'0', 
			' '
		)
	), 
	' ', 
	'0'
)

 

 

I also have tested a Spark version:

 

 

select col
, length(col) orig_len
, replace(col, '0', ' ') replaced_col
, length(replace(col, '0', ' ')) new_len
, ltrim( replace(col, '0', ' ')) trimed_col
, length(ltrim( replace(col, '0', ' '))) trimed_len
, replace(ltrim( replace(col, '0', ' ')), ' ', '0') result_col
from MVTest.testcase_for_trim

 

 

For this type of data preparation/cleaning, I will probably go with SparkSQL and develop the SparkSQL using Incorta Notebook.

Screen Shot 2022-08-18 at 12.34.12 PM.png

 Hope this help

 

 

natarajasokan
Rocketeer
You can try this in Spark SQL- regexp_replace(zerocode, '^[0]*', '')

Great add @natarajasokan !