Toggle menu
876
3.8K
30.2K
279.1K
Catglobe Wiki
Toggle preferences menu
Toggle personal menu
Not logged in
Your IP address will be publicly visible if you make any edits.

Retrieve data from top-1000-company site

From Catglobe Wiki
Revision as of 06:32, 7 April 2010 by Catglobe (talk | contribs) (Created page with '== Introduction == This Solution is used to retrieve data from site http://top1000.dk/ and then import to Catglobe system == Solution == We have * Server Path: \catproc\Sha…')
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Introduction

This Solution is used to retrieve data from site http://top1000.dk/ and then import to Catglobe system

Solution

We have

  • Server Path: \catproc\Share\CatGlobe Teams\Production\Top 1000\
  • Use Tool Top1000dk.exe to get raw data from http://top1000.dk/ site from server path
  • Customize data, import to Excel file
  • Run Macro (get Macro source code from from server path


Step 1 - Generate files by Internal Tool

  • Run Top1000dk.exe Tool, input index of page. Generate all pages. Note: run only one instance of this tool at a time.

<Top1000Tool_1 Image>

  • Go to your local C:\ to get exported files

Step 2 - Customize raw data

  • Merge all files we get from step1 to one file
  • Replace character "~" as "$$" and "|" as "$" (the purpose is separating into columns when importing these files to Excel)

<Customize data Image>

  • Save file as txt format

Step 2 - Import to Excel

  • Open Excel application -> select Data tab -> Get External data -> From Text
  • Browse the txt file we get from step2
  • Follow step 1: Choose Delimited option
  <Import 1 Image>
  • Follow step 2: Unselect "Tab" option, Select "Other" option = @
  <Import 2 Image>

Step 3 - Customize Excel data

Because of limitation of the original site http://top1000.dk/, we encounter some problems with Unicode and long names which are displayed as "..."

  • Open file "Unicode name" from server path to get list of incorrect names and replacements.
  • Replace all incorrect unicode character in file we get from Step2 by correct ones. Tips: we use IFERROR and VLOOKUP functions

<Vlookup Image>

  • Similarly, replace three-dot companies from the file 3-dot companies. We get the list of replacements of three-dot companies from file "3-dot companies"


Step 3 - Run Macro

  • Get Macro source code from server path
  • Orderly run these functions:

- Run Macro "top1000One"
- Run Macro "top1000Two"
- Run Macro "top1000Three"
- Run Macro "top1000Four"