cancel
Showing results for 
Search instead for 
Did you mean: 

Upper limit on datasets for a table ( and how are the processed ) ?

RADSr
Captain
Captain

I have a scenario in front of me where I have a bunch of identical ( columns ) Incorta tables -- i.e. I have a table for each month of the year for the past 2+ years which I want to consolidate into one big table w/ all the data inside. 

Monthly tables:   Pulled from original source.  Broken into monthly tables because they return exponentially faster that way.

Consolidated table:  IncortaSQL   apparently does not support doing a big UNION or UNION ALL  and we were getting memory errors using SQLi with the big UNION statement.   

What *IS* working is having all the month tables as separate datasets via SQLi - and it's pretty fast!   

That said, we must remain optimistic and assume the calendar will not stop. 

So we will need to keep adding month tables as we go ( we have 2023 covered, w/ July forward obvs not having any data ) 

So - finally - the question:  Is there a limit to the number of datasets which Incorta will support?   

Corollary question - how are these processed?  Does Incorta send 36 requests ( to itself in this case ) at the same time?  Is that governed by the parallel job tasks config in the CMC?   I can't imagine they are all done in series, although there must be some staging somewhere w/in the mix because I am having Incorta enforce primary key constraints. 

I'm very excited about this - give me some good news Community members!  😉   

Screen cap - 3 of 30! datasets  ( 2021 combines Jan-Jun as they were relatively low volume months )  

RADSr_0-1687458262087.png

 

  

-- IncortaOne@PMsquare.com --
5 REPLIES 5

RADSr
Captain
Captain

TL/DR - making these modular will give me more flexibility and increase my ability to move corrective changes into the most used dashboards in the shortest time possible.  

------------

That said, the primary reason for staging into Incorta first is to meter/limit the activity against the source DB .    The source is a view on a production box.  The view is ( I gather ) pretty heavy and is completely "black box" to me, so I can't do any tuning or even recreate it w/in Incorta.   The short is, because it's production I don't want to be a burden on the system which may affect operations.

Currently I have three schemae ( 2021, 2022, 2023 with 7, 12, and12 tables respectively  ) which I can run in series or independently and which can be further broken down if need be.

So - use case:  A change is made to move a product to a different category and that change needs to be reflected ASAP for the WTD/MTD/QTD/YTD reporting I can choose to run only the tables necessary for the immediate need *now* and be running correct dashboards very quickly while correcting further history at my schedule. 

 

 

-- IncortaOne@PMsquare.com --