Channel: Toad Data Point Forum - Recent Threads

RE: splitting excel file in automation process based on rows number

Hello thunderlights880,

Thank you for the post.

It is possible to do this. Here is an idea to get you started:

  1. add "Set variable" activity
    1. add variable "FileRow" of type integer and value 0
    2. add variable "FileRowCount" of type integer and value 0
  2. add "Execute script" activity
    1. (Now we create two tables: one that holds all the data, and a second that is later filled with 250k rows at a time taken from the first table)
    2. DROP TABLE if exists myTempTable ---- the "if exists" part differs among providers; this should work in MySQL, for example
    3. CREATE TABLE myTempTable (id with primary key.., then your columns here)
    4. INSERT INTO myTempTable SELECT <your_large_data> (let's say we have 2,000,000 rows, so 8 Excel files at 250,000 rows per file)
    5. DROP TABLE if exists myTempTableOut
    6. CREATE TABLE myTempTableOut (id with primary key.., then your columns here)
  3. add "While..." activity (this is the loop that produces one file per iteration)
    1. the condition would be #FileRow# == 0 or #FileRowCount# > 0
  4. add "Set variable value" activity into the while branch
    1. increment the variable FileRow by 1 (i.e. every new loop iteration will increase the file row variable)
  5. add "Execute script" activity into the While branch
    1. change the row count variable name to "FileRowCount" (originally it's something like Execute_1_RCOUNT)
    2. TRUNCATE TABLE myTempTableOut
    3. INSERT INTO myTempTableOut SELECT TOP 250000 * FROM myTempTable ---- TOP is SQL Server syntax; in MySQL use SELECT * FROM myTempTable LIMIT 250000 instead
    4. DELETE FROM myTempTable WHERE id in (SELECT DISTINCT id from myTempTableOut)
    5. SELECT id from myTempTableOut
  6. add "Log comment" activity into the while branch
    1. this will be just for diagnostics
    2. add text "file row: #FileRow#, output row count: #FileRowCount#"
    3. make sure it reports the exact number of rows currently in myTempTableOut (the last SELECT statement in the previous script is what fills the variable)
  7. add "Select to file" activity into the while branch
    1. SELECT * FROM myTempTableOut
    2. specify output type Excel and for the file path use "myOutputFile_#FileRow#"
    3. this ensures that each generated file name ends in _1, _2, _3, etc.
  8. add "Archive" activity at the end of the script
    1. zip all files that match the pattern myOutputFile_*.xlsx
  9. add "Send email" activity at the end of the script
    1. attach the zipped archive file
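
The loop logic in steps 3 to 7 can be checked outside Toad with a small sketch. This is only an illustration under assumptions: it uses Python with an in-memory SQLite database as a stand-in for your real connection, a chunk size of 4 instead of 250,000, a made-up `payload` column, and a hypothetical helper named `split_into_chunks`. It does not call any Toad activity; the two local variables simply play the roles of #FileRow# and #FileRowCount#.

```python
import sqlite3


def split_into_chunks(conn, chunk_size):
    """Mimic the While branch: each pass moves up to chunk_size rows from
    myTempTable into myTempTableOut and records them under a file name."""
    cur = conn.cursor()
    chunks = []
    file_row = 0        # plays the role of #FileRow#
    file_row_count = 0  # plays the role of #FileRowCount#
    while file_row == 0 or file_row_count > 0:
        file_row += 1
        # SQLite has no TRUNCATE TABLE, so DELETE stands in for it here.
        cur.execute("DELETE FROM myTempTableOut")
        # LIMIT is the SQLite/MySQL counterpart of SELECT TOP n.
        cur.execute(
            "INSERT INTO myTempTableOut "
            "SELECT * FROM myTempTable ORDER BY id LIMIT ?",
            (chunk_size,),
        )
        cur.execute(
            "DELETE FROM myTempTable "
            "WHERE id IN (SELECT id FROM myTempTableOut)"
        )
        rows = cur.execute(
            "SELECT id, payload FROM myTempTableOut ORDER BY id"
        ).fetchall()
        file_row_count = len(rows)
        # Assumption of this sketch: skip the export when the pass found no rows.
        if file_row_count > 0:
            chunks.append((f"myOutputFile_{file_row}", rows))
    return chunks


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTempTable (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE myTempTableOut (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO myTempTable VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 11)],  # 10 sample rows
)

chunks = split_into_chunks(conn, 4)
for name, rows in chunks:
    print(name, len(rows))
```

With 10 rows and a chunk size of 4 this yields three chunks of 4, 4 and 2 rows. Note the guard before the export: following the steps exactly as listed, the final pass finds no rows but would still appear to reach the "Select to file" activity, so you may want a similar check (for example an "If" condition on #FileRowCount#) to avoid an empty last file.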

I hope this makes sense (I haven't tested it; I wrote it from memory).

Martin
