#!/usr/bin/perl
#
<< 'HEADER_END';

NAME

lang_srom.per -- periodic task for the Languages Sucks/Rules chart

DESCRIPTION

This code updates the content at:

  http://www.mrob.com/lang_srom.html
  http://www.mrob.com/lang_srom.txt

It uses the SimpleGet.pl library, a stand-alone file containing the bare
minimum of code necessary to implement the functionality of the
LWP::Simple package. For more info about SimpleGet, go here:

  http://www.mrob.com/SimpleGet.txt

Important: If you publish web pages that contain links to Alta Vista (as
does the page generated by this script) you must agree to the terms of use
specified by the AltaVista company. Their noncommercial use terms are
quite reasonable. Find the "terms of use" link at the bottom of their
home page.

This script is based on the Operating System Sucks-Rules-O-Meter (SROM).
Here is its original copyright notice:

  Copyright 1998 Electric Lichen L.L.C.
  Don Marti

  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; either version 2 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  GNU General Public License for more details.

If you got this from the website www.mrob.com, the GNU General Public
License may be retrieved from the following URL:

  http://www.mrob.com/ries/COPYING.txt

Otherwise, or for more information, go to:

  http://www.fsf.org/copyleft/copyleft.html

NOTES REGARDING GROWTH OF THE INTERNET

From bwebd logfiles:

                                       perl   perl   perl
          c sucks  c rules  c rocks   sucks  rules  rocks
20000218                                 30    316      3
20000514      173     5420      372      34    402     28
20000614      156     6323      395      28    387     22
20000718      163     6684      396
20000808      161     6673      395
20000901      154     6174      384

Google's cache of my page (from 2000 July 26) shows the same number
(13359) at the top-left of the chart as the current (2000 Sep 19), so
probably that number has been the same that whole time. However, language
symbols on the chart moved in that time, so AltaVista has been changing.

6666 hits for C and C++ (m3) Tue Sep 19 21:08:01 2000
5908 hits for C and C++ (m3) Thu Oct  5 13:06:00 2000

hereafter the most-mentioned language and its hitcount are given in a
terser format, and the 6 numbers at the end are the values of:

    C     C     C    perl  perl  perl
  sucks rules rocks sucks rules rocks

Fri Oct  6 01:19:34 2000 (m1) [5908 C and C++] 144 5403 361 29 308 28
Sat Nov  4 20:20:58 2000 (m1) [5875 C and C++] 142 5376 357 29 309 28
Sat Dec  2 09:06:26 2000 (m1) [5831 C and C++] 142 5347 342 29 310 28
Sun Dec 31 01:43:44 2000 (m1) [6136 C and C++] 492 5326 318 57 312 50
Tue Jan 30 00:55:40 2001 (m1) [6134 C and C++] 494 5323 317 30 311 28
Thu Feb  8 00:12:03 2001 (m1) [5788 C and C++] 145 5324 319 30 264 49
Thu Apr 12 16:30:08 2001 (m1) [10789 C and C++] 295 10087 407 52 429 40
Thu Apr 26 18:21:18 2001 (m1) [10789 C and C++] 295 10087 407 51 421 37
Sat May 26 03:14:53 2001 (m1) [8740 C and C++] 192 8118 430 50 155 43
Mon Jun 11 12:28:18 2001 (m1) [8820 C and C++] 189 8194 437 49 153 43
Wed Jul 18 23:24:52 2001 (m1) [8305 C and C++] 180 7704 421 47 149 42
Tue Jul 24 07:47:23 2001 (m1) [8305 C and C++] 180 7704 421 47 149 42

REVISION HISTORY

Don Marti:
 revised 19 Mar 1998 -- added $rule_offset
 revised 3 June 1999 -- new AltaVista result page format

Robert Munafo:
 20000113 Removed dependency on LWP; still depends on IO::Socket. I also
replaced OS names with language names because
http://orwant.www.media.mit.edu/tpj/rules is broken today and I wanted to
know which sucked worse: Fortran or Cobol.
 20000214 Use SimpleGet.pl library. Copy to /rhtf; convert into a .per
script. Write code to generate 2-D scatter plot of results.
 20000215 Add forth, lisp, postscript, tcl, smalltalk. Recognize "1 page
found" returns (gives better results for rare languages). Add trademark
notices. Add color to symbols, allowing for 5 different P's.
 20000217 Add note about special cases Python, C.
 20000414 "rightmost" example must have more than 100 hits
 20000524 Fix formatting problems with MSIE 5 in MacOS
 20000717 Add note about why Prolog isn't included.
 20000801 Add similar note about REXX and Haskell.
 20000919 Add logging of $maxhits to var/lang_srom_log, to track the
growth rate of the Internet
 20001005 Fancier permanent logging of hit counts for C and Perl
 20001110 Add explanation about Visual Basic.
 20010505 New Alta Vista query URL format and results format
 20010717 Yet another results format, remove 'true basic' because no hits
 20020220 Add "touch ." at end
 20021111 New results format
 20030314 Add Ruby
 20030318 Add %mandates to facilitate accuracy of Ruby
 20031205 Alta Vista's hostname changed back to www.altavista.com
 20050309 New query URL format; make it run fast, just for debugging
 20081014 Add note about growth of top hitcount; don't generate blank
part of bottom of chart.
 20100813 Switch to advanced query form, because quoting a phrase no
longer works in the simple query form
 20100928 Add cacheing mechanism
 20100929 Log date along with each cached result
 20100930 Save load error HTML in a temp file for diagnostic use. Restore
"raw data" table (original 20000113 code from before I added the ASCII
scatter-plot).
 20101001 Clean up formatting of raw data table and permute the contents
so it reads down column A, then down column B.
 20101010 Switch to Yahoo query format; handle wildly inaccurate results
counts.
 20130121 Redo the output formatting because color tags inside
were no longer working.
 20181005 An instance of #lang_srom.per# hung my #bwebd#, I fixed it by
adding a default alarm timeout in SimpleGet.pl
 20181012 Update match patterns so it works once again. Clean up some
of the formatting and acknowledge Yahoo.
 20220910 Use #bget#; new URL format using 'fr=yhs-invalid' and new
match pattern for e.g. 'About 729 search results'

TO DO

20100515 The overall design, using AltaVista, should be deprecated and
replaced with data from a site like Amplicate (see
http://amplicate.com/software/list-of-programming-languages) provided
there is enough data to qualify (Amplicate currently doesn't cut it).

HEADER_END

use strict;
require 5.004;
my($hd) = $ENV{"HOME"};
require "$hd/bin/SimpleGet.pl";

###########################################################################
#
# Local variables containing static text

# Original and revised query URL formats

# 20101010: Switched over to a completely different method today, including
# the switch to Yahoo (using $yhurl). Each phrase is now quoted with
# Q_BEG and Q_END.

my $avurl = "http://www.altavista.com"; # 20000113
   $avurl = "http://www.altavista.digital.com";  # 20010505
   $avurl = "http://www.altavista.com"; # 20031205
my $yhurl = "http://search.yahoo.com"; # 20101010
   $yhurl = "https://search.yahoo.com"; # 20181012

my $SEARCH_PREFIX = qq{$avurl/cgi-bin/query?pg=q&what=web&kl=XX&q=}; # 20000113
   $SEARCH_PREFIX = qq{$avurl/sites/search/web?kl=XX&pg=q&q=}; # 20010505
   $SEARCH_PREFIX = qq{$avurl/sites/search/web?q=}; # 20010717
   $SEARCH_PREFIX = qq{$avurl/web/results?q=}; # 20050309
   $SEARCH_PREFIX = qq{$avurl/web/results?itag=ody&pg=aq&aqmode=s&aqp=}; # 20100813
   $SEARCH_PREFIX = qq{$yhurl/yhs/search?p=}; # 20101010
   $SEARCH_PREFIX = qq{$yhurl/search?fr=yhs-invalid&p=}; # 20220910

my $Q_BEG = "%22";
my $Q_END = "%22";

my $SEARCH_SUFFIX = ''; # 20000113
   $SEARCH_SUFFIX = qq{&pg=q&kl=XX}; # 20010717
   $SEARCH_SUFFIX = ''; # 20100813
   $SEARCH_SUFFIX = qq{&fr2=sb-top&fr=altavista&b=9999}; # 20101010
   $SEARCH_SUFFIX = qq{&fr2=sb-top&fr=altavista}; # 20181012
   $SEARCH_SUFFIX = qq{&ei=UTF-8&nojs=1}; # 20220910
   $SEARCH_SUFFIX = ''; # 20220910

my $MUST_PREFIX = "&aqa=";
   $MUST_PREFIX = "+%2B";  # ' +'
   $MUST_PREFIX = "%2B";  # '+'

my $STOP_PREFIX = "&aqn=";
   $STOP_PREFIX = "+-";

# 20100930 I might want to try:
#   http://search.yahoo.com/yhs/search?p=applescript+rules&fr2=sb-top&fr=altavista
# and match on the text:
#   1,360,042 results for applescript rules:

# The list of languages is somewhat incomplete because many languages
# have names that don't lend themselves well to search engine lookups.
my %aliases = (
  # 'Assembler' => ['assembly language'], # only a few hits, all with "rules" as a noun
  'Basic' => ['visual basic'],    # 'basic' alone hits too many pages
    # 'ms basic', 'microsoft basic',        # no hits
    # 'true basic'                          # no hits as of 20010717
    # 'integer basic', 'applesoft basic',   # this proves I'm old
  'Perl' => ['perl'],
  'Objective C' => ['objective c'],
  # 'prolog' => ['prolog'],    # "rules" is a noun
  # 'Haskell' => ['haskell'],  # "rules" is a noun
  # 'REXX' => ['rexx'],        # "rules" is a noun
  'fortran' => ['fortran', 'f77', 'f90'],
  'COBOL' => ['cobol'],
  'C and C++' => ['c'],
  'Pascal' => ['pascal'],
  'Python' => ['python'],
  'AppleScript' => ['applescript'],
  'PHP' => ['php'],
  'Tcl' => ['tcl'],
  'lisp' => ['lisp', 'scheme'],
  'forth' => ['forth'],
  'Java' => ['java'],
  'JavaScript' => ['javascript'],
  "Maple" => ['maple'],
  'PostScript' => ['postscript'],
  'Smalltalk' => ['smalltalk'],
  'Ruby' => ['ruby'],
  'zzz_end_flag' => [ ]
);

my %stops = (
  "C and C++" => ["a.c.sucks"],
  "Maple" => ["us.maple", "rich.maple", "concentric.strata"],
  "Python" => ["monty"],
);

# These search terms are required for a match. They are useful for
# languages like 'python', which share a name with something else that
# is unrelated to the language, but where no obvious stop word like
# 'monty' is available.
my %mandates = (
  "Ruby" => ["language"],
);

my %synonyms = ('sucks' => ['sucks'],
                'rules' => ['rules', 'rocks']
);

my %permlog = ('c' => 1, 'perl' => 2); # ...not to be confused with
                                       # (knit 1, perl 2) ack!
my %pqidx = ('sucks' => 0, 'rules' => 1, 'rocks' => 2);
my @pqarr;

# %tag determines the letter and color each language will have on the
# chart. The first character is the letter and must be uppercase; the
# second character gives the color and must be lowercase.
my %tag = (
  # colors should be: k, r, o, g, b and then repeat
  # add new languages in alpha order and re-do the colors
  "AppleScript" => "Ak",
  "Basic" => "Br",
  "C and C++" => "Co",
  "COBOL" => "Cg",
  "forth" => "Fb",
  "fortran" => "Fk",
  "Haskell" => "Hr",
  "Java" => "Jo",
  "JavaScript" => "Jg",
  "lisp" => "Lb",
  "Maple" => "Mk",
  "Objective C" => "Or",
  "Pascal" => "Po",
  "Perl" => "Pg",
  "PHP" => "Pb",
  "PostScript" => "Pk",
  "Python" => "Pr",
  "Ruby" => "Ro",
  "Smalltalk" => "Sg",
  "Tcl" => "Tb",
  'zzz_end_flag' => '.'
);

my %lcstyle = (
  "k" => "",
  "r" => "",
  "o" => "",
  "g" => "",
  "b" => "",
  "?" => ""
);
my $lcoff = "";

# I set $myambiv to the name of a language that I think both
# rules and sucks. It is used as an example in the text.
my $myambiv = "JavaScript";

my $maxhits = 0;
my $maxhwhat = "";

###########################################################################
#
# Subroutines

sub maptag {
  my ($t) = @_;
  my ($rv);

  if ($t =~ m/^([A-Z])([a-z])$/) {
    $rv = $lcstyle{$2} . $1 . $lcoff;
  } else {
    $rv = $lcstyle{'?'} . '?' . $lcoff;
  }
  return($rv);
}

# quoteit turns special characters into %-escapes for use as fields in an
# HTTP GET-style form submission.
sub quoteit {
  my ($s) = @_;

  $s = lc($s);
  if(0) {
    # Old mapping
    $s =~ s/ /%20/g;   #   20
    $s =~ s/\"/%22/g;  # " 22
    $s =~ s/\+/%2B/g;  # + 2B
  } else {
    $s =~ s/\+/%2B/g;  # + 2B
    $s =~ s/ /+/g;
  }
  $s = $Q_BEG . $s . $Q_END;
  return $s;
}

# This routine takes a number and formats it into a 7-character column,
# right justified and padded with spaces. It uses scientific notation
# when necessary.
sub fmt_hits {
  my($hits, $i, $chartrows) = @_;
  my($h1);

  # $hits = $hits * 1234 * ($chartrows + 1 - $i);
  $h1 = $hits;
  if ($hits > 9999999999) {
    # e.g. 12345678901 -> 1.23e+10
    $hits = sprintf("%7.2e", $hits);
  } elsif ($hits > 9999999) {
    # e.g. 12345678 -> 1.235e+07
    $hits = sprintf("%7.3e", $hits);
  }
  $hits =~ s/e\+/e/;
  $hits =~ s/e0/e/;
  $hits = ",,,,,,," . $hits;
  # make the label blank except every 4th line (the commas get turned to
  # spaces later)
  $hits = ",,,,,,," if (($i % 4) > 0);
  # Keep the last 7 characters
  $hits =~ m/(.{7})$/;
  $hits = $1;
  # print "fmt_hits: '$h1' -> '$hits'\n" if ($hits ne ",,,,,,,");
  return($hits);
}

# A simple cacheing interface to AltaVista census queries.
# We store the answers of each query, as well as the time the query was made.
# After a while this data might be useful as historical statistics.
my(%cached) = ();
my(%cache_when) = ();
my($g_cache_pn) = "$hd/tmp/lang_srom_cache.txt";

sub init_cache {
  my($l);

  if (-e $g_cache_pn) {
    open my $F1, "<", $g_cache_pn;
    while($l = <$F1>) {
      my $v; my $url; my $when;
      chomp $l;
      if ($l =~ m/^([0-9]+) +([0-9]+) +(.+)$/) {
        $when = $1; $v = $2; $url = $3;
        $cached{$url} = $v;
        $cache_when{$url} = $when;
      } elsif ($l =~ m/^([0-9]+) +(.+)$/) {
        # Legacy data (without dates) is from 20100928
        $v = $1; $url = $2;
        $cached{$url} = $v;
        $cache_when{$url} = 1285756252;
      }
    }
    close $F1;
  } else {
    open my $F1, ">", $g_cache_pn;
    print $F1 "# Datafile for lang_srom.per\n";
    close $F1;
  }
} # End of init.cache

my $fast = (shift != 0);
my $longest_fail = 0;

my $g_tdir = "$hd/tmp";
my $g_htm = "$g_tdir/lang_srom-temp.html";
my $g_hdr = "$g_tdir/lang_srom-hdrs.txt";

sub bget1 {
  my($url) = @_;
  my($rv, $IN, $l);

  system('bget', $url, '-O', $g_htm, '-D', $g_hdr);
  $rv = '';
  open($IN, $g_htm);
  while ($l = <$IN>) {
    chomp $l;
    $rv .= "$l\n";
  }
  close $IN;
  return $rv;
} # End of b.get1

# Get a URL from cache, or from the Internet if cache is not available.
# {We save results in a cache file to protect against a partly-down network
# or unreliable server. There have been times when Alta Vista only responded
# to a small percentage of our requests, resulting in us generating a
# rather useless chart. -20100928}
#
# %%% This will eventually check to see how old the cached version is and
# try to reload if/when it is older than about 1 month.
#
sub cached_get {
  my ($url) = @_;
  my $raw = -1;
  my $now = time;
  my $g1 = '';

  if ($cached{$url} ne '') {
    return $cached{$url};
  }

  # $g1 = get($url);
  $g1 = &bget1($url);
  sleep($fast ? 1 : 3 + int(rand(7))); # keep our impact low

  $g1 =~ s|| KEY_RC |g;
  $g1 =~ s|| |g;
  $g1 =~ s|([,0-9]+) results?| KEY_RC $1 results|g;
  $g1 =~ s|About ([,0-9]+) search results?| KEY_RC $1 results|g;
  $g1 =~ s|We did not find results for| KEY_RC 0 results |; # foo1

  # if ($g1 =~ /([\d\,]+)\D+pages?\s+found/i) {
  # if (
  #      ($g1 =~ /we found about ([\d\,]+) result/i)  # 20010505
  #   || ($g1 =~ /we found ([\d\,]+) result/i)        # 20010717
  # ) {
  if (
       ($g1 =~ m/KEY_RC +([\d\,]+) +result/i)             # 20101010
    || ($g1 =~ m/AltaVista found ([\d\,]+) results/i)     # 20021111
  ) {
    $raw = $1;
    $raw =~ s/\D//g;
    $raw += 0; # convert string "no" to number 0
    # %%% add it to the cache file
    {
      open my $F1, ">>", $g_cache_pn;
      print $F1 "$now $raw $url\n";
      close $F1;
    }
  } else {
    if (length($g1) > $longest_fail) {
      # In order to diagnose the failed queries, we save the longest fail page
      # in ~/tmp
      $longest_fail = length($g1);
      my $fpn = "$hd/tmp/lang_srom_failure.txt";
      print STDERR "failed URL: $url\n";
      system("cat $g_hdr $g_htm > $fpn");
      # open my $FAIL_LOG, ">", $fpn;
      # print $FAIL_LOG $g1;
      # print $FAIL_LOG "\n";
      # close $FAIL_LOG;
      print STDERR " header and raw HTTP ($longest_fail bytes) are in $fpn\n";
    }
  }
  return $raw;
}

###########################################################################
#
# Main program

$| = 1;

# The "get-servid" tool discovers which server we are running on. This variable
# is unessential, and I include it just for my own convenience.
my $servid = `$hd/bin/get-servid`; chomp $servid;
print "lang_srom started on service $servid\n";
# print " exiting.\n"; exit(0);

# ~/bin/pe, if present, prints the environment variables in an alphabetized,
# two-column format. This information is printed out for my convenience, but
# is not essential to the operation of this script.
# system("$hd/bin/pe"); # 20181012: I don't want/need this anymore

&init_cache();

my $greatest = 0;
my $least = 999999999;
my %count = ();
my $pop = "";
my $av_tried = 0;
my $av_bad = 0;

foreach my $lang (sort keys(%aliases)) {
  foreach my $alias (@{$aliases{$lang}}) {
    foreach my $quality ('sucks', 'rules') {
      $count{$lang}{$quality} += 0; # make sure it has a value to start with
      foreach my $synonym (@{$synonyms{$quality}}) {
        # Create the stoplist
        my $stop = "";
        foreach my $st (@{$stops{$lang}}) {
          $stop = $stop . $STOP_PREFIX . &quoteit($st);
        }
        # Create the must-list
        my $must = "";
        foreach my $st (@{$mandates{$lang}}) {
          $must = $must . $MUST_PREFIX . &quoteit($st);
        }
        my $query = $MUST_PREFIX . &quoteit(qq{$alias $synonym});
        my $url = $SEARCH_PREFIX . $query . $must . $stop . $SEARCH_SUFFIX;
        my $g1;
        my $raw = &cached_get($url);
        $av_tried++;
        if ($raw >= 0) {
          $count{$lang}{$quality} += $raw;
          if (lc("$alias $synonym") ne lc("$lang $quality")) {
            print STDERR "$alias $synonym (alias for: $lang $quality): $raw\n";
          } else {
            print STDERR "$alias $synonym: $raw\n";
          }
          if ($permlog{$alias}) {
            my $idx = (($permlog{$alias}-1) * 3) + $pqidx{$synonym};
            $pqarr[$idx] = $raw;
          }
        } else {
          $av_bad++;
        }
      }
      if ($count{$lang}{$quality} > $greatest) {
        $greatest = $count{$lang}{$quality};
        $pop = $lang;
      }
      if (($count{$lang}{$quality} > 1)
          && ($count{$lang}{$quality} < $least)) {
        $least = $count{$lang}{$quality};
      }
    } # End of foreach my $quality ('sucks', 'rules')
  } # End of foreach my $alias (@{$aliases{$lang}})
  print STDERR "\n";
} # end of foreach my $lang (sort keys(%aliases))

if ($av_bad) {
  print STDERR "lang_srom: Failed $av_bad loads out of $av_tried attempts\n";
} else {
  print STDERR "lang_srom: Success on all $av_tried attempts\n";
}

if ($greatest == 0) {
  die "lang_srom: Got no good pages at all";
} elsif ($av_bad > ($av_tried * 0.5)) {
  die "lang_srom: Too many failed loads";
}

my $mhp = 0;

# plot them on a crude ASCII chart
my $vscale = 3.0; # Number of lines that equals one power of e (see chartrows
                  # formula below)
my $hlabel = "sucks . . . . . . . debatable . . . . . . . rules";
my $hscale = length($hlabel);
my $chartrows = int($vscale * log($greatest + $greatest));
# print "hscale $hscale chartrows $chartrows\n";

# Each chart cell is two characters wide (one tag such as "Pg" per cell).
my @chart;
for (my $i=0; $i <= $chartrows; $i++) {
  $chart[$i] = "  " x $hscale;
}

my $rarest_line = 0;
my $rulezy = 0;
my $ruleslang = "";
foreach my $lang (sort(keys(%aliases))) {
  my $rules = $count{$lang}{'rules'};
  my $sucks = $count{$lang}{'sucks'};
  my $thits = $rules + $sucks;
  if ($thits > 10) {
    # compute horizontal and vertical positions
    my $hpos = int(($hscale - 1) * $rules / ($rules + $sucks));

    # To qualify as a rulzy record-setter, a language must be in the top 75%
    # of the vertical scale.
    my $cutoff = exp(log($greatest * $least * $least * $least) / 4.0);
    # print "greatest $greatest cutoff $cutoff least $least\n";

    # Keep track of which language is the rulziest
    if (($hpos > $rulezy) && ($thits > $cutoff)) {
      # New record-setter
      $rulezy = $hpos;
      $ruleslang = $lang;
      # print "$ruleslang set record rulezy=$rulezy\n";
    }

    my $vpos = $chartrows - int($vscale * log($rules + $sucks));
    my $row = $chart[$vpos];
    # Avoid plotting one symbol over another
    while (substr($row, $hpos * 2, 2) ne "  ") {
      $hpos++;
    }
    # print "lang $lang tag $tag{$lang} rules $rules sucks $sucks thits $thits hpos $hpos vpos $vpos\n";
    substr($row, $hpos * 2, 2) = $tag{$lang};
    $chart[$vpos] = $row;
    if ($vpos > $rarest_line) {
      $rarest_line = $vpos;
    }
    if ($thits > $maxhits) {
      $maxhits = $thits;
      $maxhwhat = $lang;
    }
  }
}

sub encspc {
  my($l) = @_;
  $l =~ s/ / /g;
  $l =~ s/`~/ /g;
  return $l;
}

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
#
# OUTPUT FILE GENERATION
#
# The output file is in an ASCII markup language called "RHTF". Users of
# WikiText will recognize this style of markup language, although the actual
# markup symbols in RHTF are somewhat different.
open (OUT,"> lang_srom.rhtf");
print OUT "Programming Languages: The Internet's Current Opinion
tcc[Programming Languages: The Internet's Current Opinion]
gr[Java |JavaTM]
THIS FILE IS AUTO-GENERATED BY lang_srom.per
";

print OUT &encspc("
total |$hlabel|
hits +" . ("-" x $hscale) . "+
");

my $i = 0;
my $hits = 0;
my $ht2 = "";
for ($i = 0; ($i <= $chartrows) && ($i <= $rarest_line); $i++) {
  # compute numeric value for label. The +0.5 makes the computed
  # value be centered on the values that are plotted in its row.
  $hits = exp((0.5 + $chartrows - $i) / $vscale);
  if ($hits > 10) {
    $hits = int($hits);
  } elsif ($hits > 1) {
    $hits = (int($hits * 10)) / 10;
  } else {
    $hits = "1.00";
  }
  $hits = &fmt_hits($hits, $i, $chartrows);
  $ht2 = $hits;
  $hits .= ",|" . $chart[$i] . "|
\n";
  # print(sprintf("%2d: %s", $i, $hits));

  # place color labels around any tags in this line
  $hits =~ s/([A-Z][a-z])/&maptag($1)/ge;
  # collapse each empty two-character cell to one space; the commas in
  # the label survive this step and become spaces afterwards
  $hits =~ s/  / /g;
  $hits =~ s/,/ /g;
  print OUT &encspc("$hits");
}
$ht2 =~ s/,/ /g;
$hits = (($i % 4) == 1) ? " " : $ht2;
print OUT &encspc("$hits +" . ("-" x $hscale) . "+

");

# Show the legend.
$i = 0;
foreach my $lang (sort {lc($a) cmp lc($b)} (keys(%aliases))) {
  my $t = $tag{$lang};
  if ($t ne ".") {
    print OUT &encspc("
\n ") if ($i % 3 == 0);
    # Pad to a 20-character column. The two leading spaces matter: substr
    # drops the first, and the pattern below needs the second.
    $t = substr("  $t = $lang" . (" " x 20), 1, 20);
    $t =~ s/ ([A-Z][a-z]) =/" " . &maptag($1) . " ="/ge;
    print OUT &encspc($t);
    $i++;
  }
}

my $date = scalar(gmtime(time()));
print OUT "

Updated $date GMT. (s$servid)
";

print OUT "The numbers come from Yahoo queriesfn[3]. For example, a search
for |+\"$myambiv sucks\"| will reveal that there are "
  . $count{$myambiv}{'sucks'}
  . " web pages containing the text '$myambiv sucks'fn[1]. That number,
along with the results of queries for '$myambiv rocks' and
'$myambiv rules', is used to plot the position of the symbol '"
  . &encspc(&maptag($tag{$myambiv}))
  . "' on the chart. (As you can see, this paragraph rates $myambiv
ambiguously -- and that reflects rather accurately how I feel about
$myambiv :-) The chart is updated dailyfn[2].

";

print OUT "Languages that appear *higher* on the chart (like $pop) appear
on a greater total number of web pages. Languages that appear closer to
the *right* side of the chart (like $ruleslang) have a greater
'rules/sucks' ratio -- that is, they appear more often with 'rules' than
with 'sucks'. However, you should keep in mind that languages near the
bottom of the chart are not mentioned on many web pages, so their
horizontal position isn't as accurate an indicator of their true karma.

You might notice that there aren't many *popular* languages that *suck*
-- that is, there are not many languages near the upper-left corner of
the chart. This confirms the theory that really sucky languages never
become widespread enough to be mentioned on lots of web pages.

On the other hand, there are plenty of *unpopular* languages that *rule*
(these appear in the bottom-right portion of the chart). In most cases,
these are specialized languages -- they do some limited job really well,
but haven't become popular because they are limited to specific types of
tasks. In rare cases, these are new languages that will someday rule the
world but currently only rule their early adopters.

Regarding the data collection: As already mentioned, these data points
come from search engine results. Some languages, such as Python, require
the use of *stop words* (such as 'monty') to prevent unrealistic results
from being plotted.
The languages C and C++ both appear on the chart as '"
  . &encspc(&maptag($tag{"C and C++"}))
  . "' because their names make it impossible to search for one without
finding the other.

Other languages are missing simply because I have overlooked them or do
not consider them important. Suggestions are welcome, but will not always
be accepted. In particular, I consider specific brands of a language to
be insignificant except in cases (like Maple and AppleScript) where the
brand *is* the language. I also don't care much about specialized
languages (PHP is an example, but it rules so much I couldn't restrain
myself :-)

NOTE: The statistics for *Basic* actually consist almost entirely of
references to Visual Basic. Since there are also a couple of other
Basics, I decided to lump them all together and call it just \"Basic\".

NOTE: Prolog, REXX and Haskell are not included because the word *rules*
has a special meaning in those languages (as a plural noun), making the
'rules' counts meaningless.

The vertical scale, which shows the number of hits, is logarithmic. Over
the lifetime of this web page (which went live in 2000) the top of this
scale has increased from 13359 to its present value.

(the old URL was http://home.earthlink.net/~mrob/pub/lang_srom.html )

";

print OUT "This page is inspired by the awesome
[Operating System Sucks-Rules-O-Meter|http://srom.zgp.org/]
which you are encouraged to visit if you like this sort of thing. I also
have a certain fondness for the now-defunct
[Tool of Objective Truth|http://www.zdnet.co.uk/athome/misc/toot/]
formerly featured at ZDNet UK.

";

print OUT "If you are running Linux (Linux rules) and know Perl (Perl rules even more";
print OUT " than $ruleslang" if ($ruleslang ne "Perl");
print OUT ") you might be interested in the [source``code|lang_srom.txt]
for the program that generates this page. Note in particular how it
recomputes parts of the explanatory text to match the chart.
The output it generates is in RHTF (Readable HyperText Format), part of
my automated web authoring system. ```

<\$: Raw data from Yahoo Search queries:

";

# This is the old table format, before I got the scatter chart working.
my $colmod = 0; # We want a "-\n" the first time
my @tuples = ();
print OUT "tho[ border=0 cellpadding=0 cellspacing=0]\n\n";
my $bigspc = "`" x 27;
my $i = "\$*Language*\$ | ``` | \$*Sucks*\$ | ``` | \$*Rules* or *Rocks*\$";
print OUT " $bigspc | $i | $bigspc | $i\n";
foreach my $lang (sort {lc($a) cmp lc($b)} (keys(%aliases))) {
  my $sucks = $count{$lang}{'sucks'};
  my $rules = $count{$lang}{'rules'};
  my $suckage = int (100* $sucks/$greatest);
  my $suck_offset = 100 - $suckage;
  my $ruleage = int (100* $rules/$greatest);
  my $rule_offset = 100 - $ruleage;
  if ($lang ne 'zzz_end_flag') {
    $tuples[$colmod] = "\$$lang\$ | | \$$sucks\$ | | \$$rules\$";
    $colmod++;
  }
}
my $nrows = int(($colmod+1)/2);
for($i=0; $i<$nrows; $i++) {
  print OUT " -\n | $tuples[$i] | | $tuples[$i+$nrows]\n";
}
print OUT "
\n";

print OUT " :> ```

----

#Footnotes#

fnd[1]NOTE: On the first results page, the number of results might be in
the millions. This number is a crude estimate and is often completely
wrong. But if you proceed to go through the results page by page, the
number of results will eventually be computed 'for real'. The |lang_srom|
script gets a more accurate count in a single query by requesting the
1000^{th} results page. This query coerces the server to figure out how
many matches there actually are.

fnd[2]Although the chart is updated daily, most of the queries are
performed less often and cached to reduce impact on the search engine.

fnd[3]Before Yahoo, it used AltaVista. You can read about AltaVista's
history in
[digital.com's``article``about``AltaVista|https://digital.com/about/altavista/]
by Claire Broadley.

----

(-The Programming Languages Internet Opinion Chart is older than Google.-) `
(-It is based on a similar page for operating systems created by Don Marti.-) `
(-Both were made possible by-) `
(-pb[AltaVista{(r)} logo|+altavista-1996.jpg]-) `
(-and the present implementation is powered by [Yahoo!``Search|https://search.yahoo.com/].-)

<\$:
Yahoo! Search{(r)} is a registered trademark of the Yahoo Company. This site is not endorsed by, sponsored by, or affiliated with Yahoo.
AltaVista{(r)} and the AltaVista{(r)} Logo were trademarks of AltaVista{(r)} Company, and are now owned by Yahoo.
AppleScript{(r)} is a registered trademark of Apple Computer, Inc.
Java and JavaScript{(r)} are registered trademarks of Sun Microsystems.
Linux{(r)} is a registered trademark of Linus Torvalds.
Maple{(r)} is a registered trademark of Waterloo Maple, Inc.
PostScript{(r)} is a registered trademark of Adobe Systems, Inc.
Visual Basic{(r)} is a registered trademark of Microsoft Corporation.
ZDNet is a trademark of Ziff-Davis Publishing Company.
:>

";
close OUT;

# update public copy of myself
system("cp lang_srom.per lang_srom.txt");

# log max hits (this is to measure size of the Internet)
chdir;
open(OUT, ">> var/lang_srom_log");
print OUT "$date (m$servid) [$maxhits $maxhwhat]";
print OUT (" " . ($pqarr[0]+0)
         . " " . ($pqarr[1]+0)
         . " " . ($pqarr[2]+0)
         . " " . ($pqarr[3]+0)
         . " " . ($pqarr[4]+0)
         . " " . ($pqarr[5]+0) . "\n");
close OUT;

# This causes my web-authoring software to notice that something has changed,
# which makes it compile the RHTF into HTML and upload to the web server.
system("touch .");

exit 0;