Importing HTML Tables into a Calc Document

From NeoWiki

Revision as of 02:37, 4 December 2008 by Sardisson (Talk | contribs)
Jump to: navigation, search
This article is not up to date, or it needs structural or stylistic changes to conform with the rest of this wiki.

The Calc module of NeoOffice allows you to save tables from HTML pages. Several methods are available, and one of them allows you to synchronize your tables with those of the internet site.

Contents

Saving without synchronization

With almost all web browsers, you can select the table on the web page and then simply copy/paste it in a spreadsheet.
Sometimes all the table is pasted in only one column. If that is the case, then you can use one of the following methods:

First method

  • Copy the table Cmd-C.
  • Open a Calc document and in the Edit menu, choose the Paste Special… and one or the other option can give the desired result.

Second method, usable if the page is publicly accessible

  • Open a Writer Web document.
    • To do that, in the File menu, choose the New sub-menu and the HTML Document item.
  • Copy the URL of the web page which contains the desired table then paste it in the Load URL field located on the top and on the left side of the document toolbar.
  • Press Enter or Return.
  • Select the table you want to insert in a spreadsheet and copy it Cmd-C.
  • Paste it Cmd-V in your Calc document.

Saving with synchronization

The web page must be publicly accessible.
If you want your table to be synchronized with the web page, you can proceed in the following way:

  • Open the web page which contains the table you want to import.
  • Copy the URL.
  • Open a Calc document and then select a cell.
  • Choose the Link to External Data… sub-menu of the Insert menu.
  • Press Enter or Return.
  • In the dialog that appears, you'll see HTML_tables and HTML_all among several options in the Available tables/ranges section.
  • Select the table you want to insert and click OK. If you are not able to find the desired table, select HTML_all. Later you will be able to copy/paste the part of the web page of interest to you.
  • You may specify a synchronization rate by checking the Update every checkbox and setting the synchronizations rate in the provided box.

Opening Auto-Generated xls files in Calc

Some websites create auto-generated files that are actually HTML files, but which have an xls extension. Excel will open these files correctly, but if you try to open them with NeoOffice, they will open in Writer, because NeoOffice recognizes that they are HTML files. offered the following solution in the trinity forums.

This procedure assumes that the troublesome spreadsheet has been saved to your local machine - that is, the file that has the embedded HTML forcing it to open in Writer/Web.

  • Open a new blank spreadsheet in Calc.
  • In the Insert menu, choose Link to External Data….
  • The External Data window appears, asking you for the "URL of external data source" - obviously, you don't need the URL as the file is already on your hard drive. So then, click the browse button (...) just to the right of the URL field. Browse to the location of the spreadsheet, select the file, and Calc will attempt to find the data tables in the spreadsheet. In my experience, this was not a quick process - be prepared to wait a minute or two as Calc digs though the file attempting to make sense of what it's found.
  • Once Calc has sifted through the file you'll see a few choices in the Available tables/ranges section - I had several options from which to choose, including HTML_tables, HTML_all, and HTML_1
  • Select the table you want to insert into the spreadsheet and click OK. I simply selected HTML_all and got what I needed, but I suppose that you could select only specific tables if you wanted to exclude some data.



This article in other languages: Français Italiano
Personal tools