
Project Summary


Extracting data from YouTube

requested: 2/23/2010 version: 2.32.1

Demonstrates how to use an input data source to provide start URLs

Target URL: http://www.youtube.com/watch?v=bFinpvgCvR8

Download demo projects and sample data extracts YouTube.zip

Request

How would you extract the title and embed code for this video, and the same data for ALL videos from the same channel (bobvila)?
All help would be appreciated.
 

Solution

Extracting the required data from a single YouTube video page is very easy. However, following all the links in the bobvila catalogue is difficult, because the links are loaded on the fly while you scroll down the list, and Visual Web Ripper does not support this kind of AJAX functionality.

The easiest (and most reliable) way to extract the YouTube data is to first create a project that extracts all the video links, and then use the extracted links as start URLs for another project that extracts the YouTube data.

All the video links are displayed at this URL:

http://www.youtube.com/profile?user=bobvila#g/u

I use a wait element to wait for the last link in the list. This link does not exist until you scroll down to the end of the list, so you should enable the wait option on this element only while you are running the project.

To run the project do this:

1. Edit the waitElement content element and set the "Wait for element" option.

2. Run the project, but make sure you set the "View browser" option, so you can see the browser when the project runs.

3. Visual Web Ripper will now open the web page and wait for the last link. You need to manually scroll down to the last element in the list, and Visual Web Ripper will then continue to extract all the links.

4. When the project has finished running, reset the "Wait for element" option, so the project will open without issues next time.

You now have all the start URLs for the next project. Copy the CSV file with the start URLs to the project folder and run the next project (Youtube2.rip), which will extract the title and embed code for the videos.
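
The hand-off between the two projects amounts to reading a CSV file of video links and treating each row as a start URL. As a rough illustration outside Visual Web Ripper, the Python sketch below reads such a file and derives the video ID and an embed URL for each link. The column name "url", the sample file contents, and the https://www.youtube.com/embed/<id> URL form are assumptions made for illustration; they are not taken from the demo projects.

```python
import csv
import io
from urllib.parse import parse_qs, urlparse


def video_id(watch_url):
    """Return the v= query parameter of a watch URL, or None if it is missing."""
    values = parse_qs(urlparse(watch_url).query).get("v")
    return values[0] if values else None


def read_start_urls(csv_file, column="url"):
    """Return the non-empty values of one column from a CSV with a header row."""
    urls = []
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        if value:
            urls.append(value)
    return urls


def embed_url(watch_url):
    """Build an embed URL from a watch URL (assumed URL form)."""
    vid = video_id(watch_url)
    return f"https://www.youtube.com/embed/{vid}" if vid else None


# Hypothetical stand-in for the CSV file written by the first project.
SAMPLE_CSV = "url\nhttp://www.youtube.com/watch?v=bFinpvgCvR8\n"

START_URLS = read_start_urls(io.StringIO(SAMPLE_CSV))
EMBED_URLS = [embed_url(u) for u in START_URLS]
```

In practice the in-memory sample would be replaced by the real file opened with open(path, newline=""), and the column name adjusted to whatever header the first project writes.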


Discuss this project

2/24/2010

I loaded Youtube2 with the Youtube1 result in the project folder.
Then, just hitting run project, I get this error:
Object Reference not set....
 

2/24/2010

Sequentum Support

The input data source was set to Youtube2_data1.csv instead of Youtube1_data1.csv, so unless you changed that it would cause an error, but not the error you're describing. I'm unable to duplicate this error on any of our test computers.
 
The project was created with the latest release of Visual Web Ripper. Did you upgrade to the latest release?

2/24/2010

Where do I change the input data source?

2/25/2010

Sequentum Support

You can use the Project menu "Input Data Source" or the toolbar button "Input Data Source".

2/25/2010

Hi,
 
Instead of using project 1, can I use a YouTube feed (http://gdata.youtube.com/feeds/api/users/LokSattaParty/uploads?) or any other feed as the input URL in project 2?
 
Rajiv

2/26/2010

Sequentum Support

That is not currently possible, but it's a very interesting idea, so we may add such a feature in the not so distant future.
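
Until such a feature exists, one possible workaround is to convert a feed into a start-URL CSV file outside Visual Web Ripper and then use that file as the input data source. The Python sketch below assumes the feed is in Atom format and that each entry carries a link with rel="alternate" pointing at the video page. The sample feed, the "url" column name, and the function names are hypothetical; the sketch does not fetch anything from the network.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed content; a real feed would be downloaded first.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example video</title>
    <link rel="alternate" type="text/html"
          href="http://www.youtube.com/watch?v=bFinpvgCvR8"/>
    <link rel="self" href="http://example.invalid/entry/1"/>
  </entry>
</feed>"""


def feed_links(feed_xml):
    """Return the href of every rel="alternate" link found in the feed entries."""
    root = ET.fromstring(feed_xml)
    links = []
    for entry in root.findall("atom:entry", ATOM_NS):
        for link in entry.findall("atom:link", ATOM_NS):
            if link.get("rel") == "alternate" and link.get("href"):
                links.append(link.get("href"))
    return links


def to_csv(links, column="url"):
    """Serialise the links as CSV text with a single header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])
    for link in links:
        writer.writerow([link])
    return buffer.getvalue()


FEED_CSV = to_csv(feed_links(SAMPLE_FEED))
```

The resulting text could be saved under the file name the second project expects as its input data source.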

3/2/2010

Youtube2 is working very well. What if I had to collect all URLs based on keywords?
Say I had 5 keywords for YouTube: kitchen fittings, bathroom accessories, cabinets, shelves and utensils.
Can I use one project (a modified Youtube1) to get the URLs for these 5 different keyword searches?
That way I would not have to run a single project 5 times, and could instead enter the 5 different keywords in one project.

3/3/2010

Sequentum Support

If you just want to use the YouTube search bar, then you don't need to use two projects at all. See the article below for how to set up a project that submits a form for different values.
 
http://www.visualwebripper.com/Resources/KnowledgeBase/KnowledgeBaseDetails.aspx?af_pk=3451
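
Submitting the search form once per keyword is roughly equivalent to requesting one search-results URL per keyword. The Python sketch below only illustrates that idea; it is not how Visual Web Ripper implements form submission. The "search_query" parameter name and the results URL are assumptions about the YouTube search page, and the keywords are the five from the question above.

```python
from urllib.parse import urlencode

# The five keywords from the question.
KEYWORDS = [
    "kitchen fittings",
    "bathroom accessories",
    "cabinets",
    "shelves",
    "utensils",
]


def search_url(keyword):
    """Build a search-results URL for one keyword (assumed URL form)."""
    return "http://www.youtube.com/results?" + urlencode({"search_query": keyword})


# One start URL per keyword, in the same order as KEYWORDS.
SEARCH_URLS = [search_url(k) for k in KEYWORDS]
```

Each URL in SEARCH_URLS corresponds to one submission of the search form, so a single run covers all five keywords.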


© 2009-2010 Sequentum  |  Terms & Conditions  |  Privacy Statement  |  Login