
The 288,945 most popular sites on the internet

  1. A large-scale scan of the top million web sites (per Alexa traffic data) was performed in early 2010 using the Nmap Security Scanner and its scripting engine. As seen in the New York Times, Slashdot, Gizmodo, Engadget, and Telegraph.co.uk ...

    We retrieved each site's icon by first parsing the HTML for a link tag and then falling back to /favicon.ico if that failed. 328,427 unique icons were collected, of which 288,945 were proper images. The remaining 39,482 were error strings and other non-image files.

    http://nmap.org/favicon/
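
The icon lookup described above (check the HTML for a link tag, then fall back to /favicon.ico) can be sketched roughly as follows. This is only an illustration in Python using the standard library, not the Nmap script the survey actually ran; the names `IconLinkParser` and `favicon_url` are made up for this example, and real pages have more edge cases than this handles.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Remembers the href of the first <link> tag whose rel mentions "icon"."""

    def __init__(self):
        super().__init__()
        self.icon_href = None

    def handle_starttag(self, tag, attrs):
        # Only look at <link> tags, and stop after the first match.
        if tag != "link" or self.icon_href is not None:
            return
        attr = dict(attrs)
        # rel can hold several tokens, e.g. "shortcut icon".
        rel = (attr.get("rel") or "").lower().split()
        if "icon" in rel and attr.get("href"):
            self.icon_href = attr["href"]


def favicon_url(page_url, html):
    """Return the icon URL named in the HTML, else the /favicon.ico fallback."""
    parser = IconLinkParser()
    parser.feed(html)
    if parser.icon_href:
        # Resolve relative hrefs against the page URL.
        return urljoin(page_url, parser.icon_href)
    # Hypothetical fallback mirroring the description: the site root icon.
    return urljoin(page_url, "/favicon.ico")


page = '<html><head><link rel="shortcut icon" href="/img/fav.png"></head></html>'
print(favicon_url("http://example.com/a/b.html", page))
print(favicon_url("http://example.com/a/b.html", "<html></html>"))
```

The first call prints the URL taken from the link tag, the second prints the /favicon.ico fallback. Fetching the page and the icon over the network is left out on purpose.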

    Site Information for SankakuComplex.com
    * Alexa Traffic Rank: 4,154
    * Traffic Rank in US: 2,781
    * Sites Linking In: 1,666

    Sankaku Complex is missing.

    Posted 5 years ago
  2. mizore said:

    A large-scale scan of the top million web sites (per Alexa traffic data) was performed in early 2010 using the Nmap Security Scanner and its scripting engine. As seen in the New York Times, Slashdot, Gizmodo, Engadget, and Telegraph.co.uk ...

    We retrieved each site's icon by first parsing the HTML for a link tag and then falling back to /favicon.ico if that failed. 328,427 unique icons were collected, of which 288,945 were proper images. The remaining 39,482 were error strings and other non-image files.

    http://nmap.org/favicon/

    Site Information for SankakuComplex.com
    * Alexa Traffic Rank: 4,154
    * Traffic Rank in US: 2,781
    * Sites Linking In: 1,666

    Sankaku Complex is missing.

    So are numerous other sites.

    Posted 5 years ago
  3. They probably don't include porn sites

    Posted 5 years ago
  4. They did. I can spot a few.

    Posted 5 years ago
  5. The survey was a fail...

    Why are some sites not found?
    There are a few possible causes. First, the site may not have been among the top million at the time the survey was done. Check the data file to see if it was present. Second, the site may have changed its icon since the survey was done. This page downloads the current icon of the site you type in, and looks up its hash in a database. Failing that, it will look up the site name in the database, but that only works if you use the exact same name we did when doing the survey. Third, it's possible that the site timed out or didn't have an icon at the time of the survey. Fourth, this page limits the size of the icons it will download. If an icon file is too big, it won't be found. Calculate the MD5 sum of the icon yourself and enter it in the search box.
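
For anyone who wants to try the last suggestion, computing the MD5 sum of an icon is only a few lines of Python. This is a rough sketch: `icon_md5` and `lookup_icon` are names made up here, and the dictionary stands in for the survey's database, whose real format is not described in the FAQ.

```python
import hashlib


def icon_md5(data):
    """Return the MD5 sum of the raw icon bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


def icon_md5_file(path):
    """Same as icon_md5, but reads the icon from a file on disk."""
    with open(path, "rb") as f:
        return icon_md5(f.read())


def lookup_icon(data, database):
    """Look up an icon by hash in a dict of {md5 hex: site name}.

    The dict is a hypothetical stand-in for the real database.
    Returns None when the hash is unknown.
    """
    return database.get(icon_md5(data))


# Toy example with made-up bytes in place of a real favicon.
fake_icon = b"hello"
fake_db = {icon_md5(fake_icon): "example.com"}
print(icon_md5(fake_icon))                  # 5d41402abc4b2a76b9719d911017c592
print(lookup_icon(fake_icon, fake_db))      # example.com
print(lookup_icon(b"other bytes", fake_db)) # None
```

With a real icon saved to disk, `icon_md5_file("favicon.ico")` gives the string to paste into the search box.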

    Why are some icons (Amazon, Bing, Baidu) so small?
    This usually indicates that the main site timed out during the survey, and only less popular sites using the same icon responded. In other words, it represents a data collection error. For example, baidu.com didn't respond, but baidu.hk and baidu.jp did, and so what would have been one of the biggest icons is instead small. See this page for more technical details and caveats. We didn't fudge the data after the survey or attempt to fill in any obviously "missing" icons.

    Posted 5 years ago
