Eza's Image Glutton

Redirects to high-res images on gallery sites, skipping past descriptions and comments
As of 2014-12-11. See the latest version.
// ==UserScript==
// @name        Eza's Image Glutton
// @namespace   https://inkbunny.net/ezalias
// @author			Ezalias
// @description Redirects to high-res images on gallery sites, skipping past descriptions and comments
// @license     Public domain / No rights reserved
// @include     /^https*://www\.furaffinity\.net/(view|full)/.*/
// @include     https://inkbunny.net/submissionview.php*
// @include     http://gelbooru.com/*page=post&s=view*
// @include     http://www.gelbooru.com/*s=view*
// @include     http://danbooru.donmai.us/posts/*
// @include     https://danbooru.donmai.us/posts/*
// @include     /^https?://e(621|926)\.net/post/show.*/
// @include     http://*.deviantart.com/art/*
// @include     http://*.tumblr.com/*
// @include     http://*.hentai-foundry.com/pictures/*
// @include     /^https*://www\.sofurry\.com/view/*//
// @include     https://www.weasyl.com/*
// @include     http://www.y-gallery.net/view/*
// @include     http://rule34.paheal.net/post/view/*
// @include     http://rule34.xxx/index.php?page=post*
// @include     http://rule34hentai.net/post/view/*
// @include     /^https*://derpiboo.ru/*//
// @include     /^https*://derpibooru.org/.*/
// @include     http://*.booru.org/*s=view*
// @include     http://mspabooru.com/*s=view*
// @include     http://safebooru.org/*s=view*
// @include     http://www.majhost.com/cgi-bin/gallery.cgi?i=*
// @include     http://g.e-hentai.org/s/*
// @include     http://nijie.info/view.php?id=*
// @include     http://metabooru.com/*
// @include     http://www.pixiv.net/member_illust.php?mode=medium&illust_id=*
// @include     http://sleepymaid.com/gallery/displayimage.php?*
// @include     https://*.sankakucomplex.com/post/*
// @include     http://*.bronibooru.com/posts/*
// @exclude    http://www.deviantart.com/users/outgoing?*
// @exclude    *#dnr
// @version     1.21.15
// ==/UserScript==



// Any single-image submission will redirect to the full-size image. On multi-image submissions, every page except the first will redirect to its full-size image. 
// If you go "back" to the normal gallery page (to favorite the image, read its description, leave a comment, etc.) then this script will not send you forward again. 
// https://greasyfork.org/scripts/4713-eza-s-image-glutton
// http://userscripts-mirror.org/scripts/show/169968 (waaay out of date)

// TO DO: modify_pixiv and possibly modify_tumblr, to make links point to images, instead of struggling with more honest redirects
// consider doing away with 'og:' stuff for scrape_tumblr, now that redirects don't fluff up the number of pages you go Back through. consistency and simplicity have their value. 
// for modify_tumblr: for photoset pages (but everywhere, to be safe) make unlinked images link to themselves. I want nice, clean, chronological tabs for multi-image comics. 
// ugh. test without adblock enabled. 
// modify_furaffinity to change prev/next/fav links with pre-appended #dnr. not raw html fiddling: use the DOM and getElementsByType or whatever. thingy.href=url_plus_dnr. 
// flickr? maybe separately. that whole site is a mess. also full-size images are sometimes gigantic, like dozens of megabytes. 
// inkbunny: move page links above first-page preview on multi-image submissions? - by altering divs or css, if possible. minimal molestation implies better future-proofing. 
	// weasyl: replace bespoke thumbnails with smallest preview images? eh, do these as separate scripts, once userscripts stops fucking around.
// https://openuserjs.org/user/add/scripts
// http://www.pixiv.net/member_illust.php?mode=medium&illust_id=44302315#dnr is apparently a video? ah, no: it's defined as data://. wtf. 
// Consider changing some @includes to @match. 
// What the hell is @grant? 
// Pixiv isn't redirecting to full-size images. E.g. http://www.pixiv.net/member_illust.php?mode=medium&illust_id=46560793 goes to http://www.pixiv.net/member_illust.php?mode=medium&illust_id=46560793#dnr
// Consider refactoring all Eza's scripts to create and destroy fewer variables. Garbage collection might be why these scripts are rough on CPU use. 
	// Is there some way to tell JS to ditch all variables? There are points in all my scripts where they're 100% finished and can mark all memory as disposable. 
	// Maybe change text-scrape functions to grab a large but fixed-size block of text. Or ditch html_dump entirely and use soft-scrape functions that treat document.etc as a const of absurd size. 
	// There's a 'delete' operator, but it only removes properties from objects. So maybe I can do var thing = new Object; thing.stuff = longstring; ... delete thing.stuff. 
	// I mention this because I'm now considering grabbing Pixiv pages to ensure I have full-size images with a single redirect. 
// http://thehentaiworld.com/hentai-doujinshi/theres-something-about-sakura-naruto/ ? I already do rule34; there's no pretending this is just about "art." 
	// Almost deserves a more Pixiv Fixiv-like fix. Maybe just a link dump like that DeviantArt gallery script?
// Swagster.com? Eh. The name alone rubs me the wrong way. Ugh, and they watermark. 
// http://sonicrocksmysocks.deviantart.com/art/Karkat-Hug-Simulator-491167769#dnr redirects to http://www.deviantart.com/watchfeed/ ? ah, flash. 
// Nijie.info support might be missing out on multipage submissions? 
// Metabooru is either blocking the US or went down completely three months ago. Leave it in for now, just in case. 
// add http://www.bronibooru.com/

// http://www.pixiv.net/member_illust.php?id=10986873 images don't work? 
// http://www.pixiv.net/member_illust.php?mode=medium&illust_id=47375606 goes to 
// http://i3.pixiv.net/img-original/img/2014/12/03/00/09/21/47375606_p0.jpg when main image is
// http://i3.pixiv.net/img-original/img/2014/12/03/00/09/21/47375606_p0.png ... oh. This again. Ugh. 

// Owyn Tyler has a ridiculously replete script with similar goals called Handy Just Image - http://userscripts.org/scripts/show/166494
// The supported-site list is waaay longer than mine, and/but his goals are more complex. Image Glutton exists only to deliver the image. 
// He's having trouble with back-trapping, though. His solution sounds absurdly complex even compared to mine. Test the script and recommend help if possible. 






// global variables, for simplicity
var image_url = '';		// location of the full-size image to redirect to
var wait_for_dnr = false;		// some site URLs use "#" liberally, so if this var is true, only "#dnr" will stop a redirect



// detect site, extract image URL, then decide whether or not to redirect
	////////// 		Simple extract_image_url_after sites
if ( address_bar_contains('e621.net') ) { extract_image_url_after( '>Respond</a>', 'https://' ); }
else if ( address_bar_contains('e926.net') ) { extract_image_url_after( '<li>Size:', 'http' ); }
else if ( address_bar_contains('weasyl.com') ) { extract_image_url_after( '<div id="detail-art">', '/' ); }		// also redirects to plaintext/HTML on stories, haha
else if ( address_bar_contains('hentai-foundry.com') ) { extract_image_url_after( '<center><img', '//' ); }
else if ( address_bar_contains('y-gallery.net') ) { extract_image_url_after( 'a_center container2">', 'http://' ); }
else if ( address_bar_contains('rule34.xxx') ) { extract_image_url_after( '>Edit</a></li>', 'http://' ); }
else if ( address_bar_contains('derpiboo.ru') ) { extract_image_url_after( 'full res">View</a>', '//derpicdn' ); }
else if ( address_bar_contains('derpibooru.org') ) { extract_image_url_after( 'full res">View</a>', '//derpicdn' ); }
else if ( address_bar_contains('metabooru.com') ) { extract_image_url_after( 'og:image', 'http://' ); }
else if ( address_bar_contains('sankakucomplex.com' ) ) { extract_image_url_after( '<li>Original:', '//' ); } 
	////////// 		Slightly complicated extract_image_url_after sites
else if ( address_bar_contains('rule34hentai.net') ) { extract_image_url_after( 'shm-zoomer', '/_images/' ); wait_for_dnr = true; }
else if ( address_bar_contains('rule34.paheal.net') ) { extract_image_url_after( 'shm-zoomer', 'http://' ); wait_for_dnr = true; }
else if ( address_bar_contains('majhost.com') ) { image_url = document.getElementsByTagName( "img" )[0].src; }		// first and only <img> tag
	////////// 		Simple custom sites
else if ( address_bar_contains('sofurry.com') ) {
	image_url = window.location.href.replace('sofurry.com/view/','sofurryfiles.com/std/content?page='); 
	if( document.body.outerHTML.indexOf( '<div id="sfContentImage' ) < 0 ) { image_url = ''; } 		// Do not redirect from stories
	if( document.body.outerHTML.indexOf( '<div class="sf-story"' ) > 0 ) { image_url = ''; } } 		// Really do not redirect from stories
else if ( address_bar_contains('danbooru.donmai.us') ) { 
	extract_image_url_after( '% of original (', '/data/' );		// resized images will say "X% of original (view full" or something like that
	if( image_url == '' ) {extract_image_url_after( 'twitter:image:src', 'http://' );		// otherwise just grab the preview-sized image (this also works on pages claiming you need Gold to see them)
	image_url = image_url.replace( '/sample/sample-', '/' ); }	 }	// if the preview-sized image is a sample, fix that - this sometimes fails for PNG images with JPG previews
else if ( address_bar_contains('furaffinity.net') ) {
	if (unsafeWindow.full_url)			// Basically stolen from http://userscripts.org/scripts/review/157574 - but FA's kind enough to define the URL as a var, so why fight the obvious approach? 
		{ image_url = unsafeWindow.full_url; } }		// use full_url variable from live window HTML
else if ( address_bar_contains('http://g.e-hentai.org/s/') ) { 
	var page_html = document.body.outerHTML;		// serialize the page once instead of on every search
	var image_index = page_html.indexOf( '</iframe>' );		// jump to end of navigation iframe
	image_index = page_html.indexOf( 'http://', image_index+1 );		// find next URL (link to next page)
	image_index = page_html.indexOf( 'http://', image_index+1 );		// find URL after that (image source)
	image_url = page_html.substring( image_index, page_html.indexOf( '"', image_index ) ); }		// grab image src, delimited by doublequote 
else if ( address_bar_contains('nijie.info') ) {
	extract_image_url_after( 'name="twitter:image"', 'http://' );		// some images are behind some sort of barrier, so let's grab the twitter-size image instead...
	image_url = image_url.replace( '/sp/', '/' ); }		// ... and drop the /sp/ to get the full-size URL. 
else if ( address_bar_contains('sleepymaid.com') ) { 
	extract_image_url_after( 'fullsize', 'albums/' ); 
	image_url = image_url.replace( 'normal_', '' ); 
	wait_for_dnr = true; 
}
	////////// 		Sites complex enough to shove into a function down below 
else if ( address_bar_contains( 'deviantart.com' ) ) { scrape_deviantart(); wait_for_dnr = true; }
else if ( address_bar_contains( 'inkbunny.net' ) ) { scrape_inkbunny(); }
else if ( address_bar_contains( 'tumblr.com' ) ) { scrape_tumblr(); }
else if ( address_bar_contains( 'pixiv.net' ) ) { scrape_pixiv(); }
else if ( address_bar_contains( 'gelbooru.com' ) ) { scrape_booru(); } 
else if ( address_bar_contains( '.booru.org' ) ) { scrape_booru(); } 
else if ( address_bar_contains( 'mspabooru.com' ) ) { scrape_booru(); } 
else if ( address_bar_contains( 'safebooru.org' ) ) { scrape_booru(); } 
else if ( address_bar_contains( 'bronibooru.com' ) ) { scrape_booru(); } 



// having defined image_url by scraping the page's HTML, modify the current URL to prevent back-traps, then redirect to that full image 
if( image_url !== '' && (!address_bar_contains('#') || wait_for_dnr) ) 		// do nothing if image_url is empty. ignore pages with a "#", unless wait_for_dnr makes you wait for a full "#dnr". 
{
		// some images don't redirect properly, even if you manually "view image" - so we append ".jpg" to URLs without file extensions, forcing the browser to consider them images
		// even if this doesn't work, the new URL should just 404, which is better than the semi-modal "octet stream" dialog seen otherwise. 
	if( image_url.lastIndexOf( '/' ) > image_url.lastIndexOf( '.' ) ) { image_url = image_url + '.jpg'; }		// if there's not a "." after the last "/" then slap a file extension on there 
	if( image_url[ image_url.length - 1 ] == '.' ) { image_url = image_url + 'jpg'; }		// if the URL ends with a dot, slap a file extension on there 

		// modify current location, so that when the user clicks "back," they aren't immediately sent forward again
	var modified_url = window.location.href + '#dnr'; 		// add do-not-redirect tag to current URL
	history.replaceState( {foo:'bar'}, 'Do-not-redirect version', modified_url);		// modify URL without redirecting. the {foo:'bar'} thing is a state object that I don't care about, but the function needs one.

//	window.location.href = image_url;		// redirect to full image
	location.assign("javascript:window.location.href=\""+image_url+"\";");		// pixiv-friendly redirect to full image: maintains referral, happens within document's scope instead of within greasemonkey's
}		// end of main execution
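
/*
A minimal sketch of the redirect decision above, pulled out into pure functions so it can be
read and tested without a browser. The names ensure_image_extension and should_redirect are
hypothetical and do not exist in this script. One assumption: the explicit '#dnr' check here
stands in for the "@exclude *#dnr" metadata rule, which is what actually keeps the script from
running on do-not-redirect pages.

```javascript
// Hypothetical helper: append ".jpg" when the URL has no file extension.
function ensure_image_extension( url ) {
	if( url.lastIndexOf( '/' ) > url.lastIndexOf( '.' ) ) { return url + '.jpg'; }	// no "." after the last "/"
	if( url[ url.length - 1 ] === '.' ) { return url + 'jpg'; }	// URL ends with a bare dot
	return url;
}

// Hypothetical helper: decide whether to leave the gallery page for the image.
function should_redirect( current_url, image_url, wait_for_dnr ) {
	if( image_url === '' ) { return false; }	// nothing was scraped, so stay put
	if( current_url.indexOf( '#dnr' ) !== -1 ) { return false; }	// assumption: mirrors the @exclude rule
	if( current_url.indexOf( '#' ) !== -1 && !wait_for_dnr ) { return false; }	// any "#" blocks, unless the site uses "#" liberally
	return true;
}
```

Keeping the decision separate from the scraping means each site only has to answer one
question: what is image_url, and does this site need wait_for_dnr?
*/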





// ----- //			Functions for readability





function extract_image_url_after( string_before_url, url_begins_with ) {		// extract the first quote-delimited string that appears after unique first var and begins with second var
	var page_html = document.getElementsByTagName('html')[0].innerHTML; 		// serialize the page's HTML once, instead of re-reading innerHTML for every search
	var string_index = page_html.indexOf( string_before_url ); 		// find a unique string somewhere before the image URL
	if( string_index > -1 ) {
		var image_index = page_html.indexOf( url_begins_with, string_index );  		// find where the image URL starts after the unique string
		var delimiter_index = page_html.indexOf( '"', image_index ); 		// find first doublequote after the image URL starts
		if( image_index > -1 && delimiter_index > -1 ) { 		// leave image_url untouched if the URL or its closing doublequote is missing
			image_url = page_html.substring( image_index, delimiter_index ); 		// grab the image URL up to the next doublequote 
		}
	}
}
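
/*
The same marker-then-prefix scrape, sketched as a pure function over a string so it can be
tried outside a browser. extract_url_from_html is a hypothetical name, and the sample HTML in
the usage note is made up for illustration. Unlike the global-variable version, this returns
an empty string whenever the marker, the URL prefix, or the closing doublequote is missing.

```javascript
// Hypothetical pure variant of extract_image_url_after: takes the HTML as an argument.
function extract_url_from_html( html, string_before_url, url_begins_with ) {
	var string_index = html.indexOf( string_before_url );	// unique marker somewhere before the URL
	if( string_index === -1 ) { return ''; }
	var image_index = html.indexOf( url_begins_with, string_index );	// start of the URL after the marker
	if( image_index === -1 ) { return ''; }
	var delimiter_index = html.indexOf( '"', image_index );	// closing doublequote of the attribute
	if( delimiter_index === -1 ) { return ''; }
	return html.substring( image_index, delimiter_index );
}
```

Usage, with made-up markup:
extract_url_from_html( '<li>Size: <a href="http://example.com/full/1.png">big</a></li>', '<li>Size:', 'http' )
gives 'http://example.com/full/1.png'.
*/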

function address_bar_contains( string_to_look_for ) {	// I'm so tired of typing out window.location.etc == -1. It's stupidly verbose and it looks terrible.
	return (window.location.href.indexOf( string_to_look_for ) !== -1);		// this makes code more concise and readable. if( address_bar_contains( 'tld.com' ) ) { do tld.com stuff; }
}





// ----- //			Functions for individual websites (separated for being especially long)





function scrape_tumblr() {
	// Tumblr's goals are basically like Inkbunny's:
	//		- On a generic post or /image/ page with a single image: redirect to that image in the highest resolution available
	//		- On a multi-image post: do nothing, since photosets already are or link to their highest-resolution versions
	if( address_bar_contains('/image/') ) { 		// on centered-image pages
		extract_image_url_after( 'id="content-image"', 'http://' );		// get conveniently-labeled id="content-image" image
	}		// The above will handle /post/ to /image/ URL conversions in a sort of double-redirect, letting Tumblr do the hard work of finding the highest-res version of an image
	else if( address_bar_contains( '/post/' ) ) {		// on generic Tumblr posts
		image_url = window.location.href.replace( '/post/', '/image/' );  
		var comment_check_index = image_url.lastIndexOf( '/' );		// if the /post/ contained text, it gets appended after the post/image number and screws up the URL...
		if( image_url.substring( comment_check_index - 6, comment_check_index ) !== '/image' ) {		// ... so if the last '/' isn't the latter in '/image/' then we dump everything after that '/'...
			image_url = image_url.substring( 0, comment_check_index );		// ... by taking the substring up to the index of that final '/'.
		}
			// If the theme is kind enough to use proper Open Graph tags, let's use those instead for a single redirect:
		var post_image_url = image_url;		// store the double-redirect URL just in case
		extract_image_url_after( 'property="og:image"', 'http' );
		if( image_url.indexOf( '_1280.' ) == -1 ) { image_url = post_image_url; }		// if the Open Graph isn't _1280 (and thus might be misdefined at low-res), just double-redirect instead  
	}		// this might also trigger on images with no _size in the URL, but I think those are all tumblr-feed:entry items anyway. hardly matters. /image/ would still work. 

		// Now that image_url is defined, we can blank it out if we don't want to redirect. Much easier than piling on if( || && || )-style logic. 
	if( address_bar_contains( '_iframe/' ) ) { image_url = ''; }		// Do not redirect from photoset iframe pages, since they trigger their own instance of this script
	if( document.body.outerHTML.indexOf( 'class="html_photoset"' ) !== -1 ) { image_url = ''; }		// Do not redirect from photosets (because photoset images always are or link to highest-res versions)
	if( document.head.outerHTML.indexOf( 'content="tumblr-feed:entry"' ) !== -1 ) { image_url = ''; }	// Do not redirect if Open Graph indicates a text-only post (as opposed to tumblr-feed:photo). 
	if( document.head.outerHTML.indexOf( 'content="tumblr-feed:photoset"' ) !== -1 ) { image_url = ''; }	// Do not redirect if Open Graph indicates a photoset (as opposed to tumblr-feed:photo). 
}
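
/*
The /post/ to /image/ conversion in scrape_tumblr is the fiddly part, so here is a minimal
sketch of just that step as a pure function. tumblr_post_to_image_url is a hypothetical name.
It only rewrites the URL; whether Tumblr actually serves an /image/ page for a given post is
not something this sketch can know.

```javascript
// Hypothetical helper: turn a /post/ URL into the matching /image/ URL, dropping any text slug.
function tumblr_post_to_image_url( post_url ) {
	var url = post_url.replace( '/post/', '/image/' );
	var last_slash = url.lastIndexOf( '/' );
	// If the final "/" is not the one closing "/image/", everything after it is a slug: cut it off.
	if( url.substring( last_slash - 6, last_slash ) !== '/image' ) {
		url = url.substring( 0, last_slash );
	}
	return url;
}
```

For example, http://gamesbynick.tumblr.com/post/67039820534/the-secrets-out-guys-the-secret-is-out
becomes http://gamesbynick.tumblr.com/image/67039820534.
*/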

// DeviantArt's HTML is an inconsistent mess. 
// maybe use the tumblr share thing, except I'd have to unescape a bunch of $2F%E6 nonsense. 
// dev-page-download? no. that's just more custom-view horseshit. 
// collect_rid? seems to be based on URL, e.g. collect_rid="1:484295327". 
function scrape_deviantart() {		// this doesn't use ditch_html_before because data-super-full-img's appear for random links - we need to avoid grabbing one from the ass-end of small-image pages
	var image_index = document.body.outerHTML.indexOf( 'class="dev-view-deviation"' ); 		// jump to unique and hopefully universal dev-view-deviation div
	if( image_index > 0 ) { 		// Don't redirect on pages without a deviation (e.g. "oops not found" faux-404s). 
		image_index = document.body.outerHTML.indexOf( 'src=', image_index+1 ); 		// jump to first src (preview size)
		image_index = document.body.outerHTML.indexOf( 'http://', image_index+1 ); 		// jump to the URL defined in src
		image_url = document.body.outerHTML.substring( image_index, document.body.outerHTML.indexOf( '"', image_index ) ); 		// grab URL, delimited by doublequote
		if( image_url.indexOf( "/PRE/" ) > 0 ) { image_url = image_url.replace( "/PRE/", "/" ); } 		// fix preview-size image to be full-size
	} 
	if( document.body.outerHTML.indexOf( '<div id="flashed-in"' ) > 0 ) { image_url = ''; } 		// Do not redirect on flash pages 
}

function scrape_inkbunny() {
	var image_index = document.body.outerHTML.indexOf( 'https://inkbunny.net/files/screen/' );		// look for screen-size image URL 
	if( image_index !== -1 )		// if that URL is found
	{
		var delimiter_index = document.body.outerHTML.indexOf( '"', image_index );		// find first doublequote delimiter after URL
		image_url = document.body.outerHTML.substring( image_index, delimiter_index );		// grab delimited URL 
		image_url = image_url.replace( '/screen/', '/full/' );		// turn screen URL into full URL - we don't care if /screen/ is already full-size, because /full/ will kindly redirect anyway
	}

	// if this page is the landing page for a multi-image submission, do not redirect 
	if ( document.body.outerHTML.indexOf( '<form id="changethumboriginal_form"' ) !== -1 && !address_bar_contains( '&page=' ) ) {		// look for language-agnostic 'show custom thumbnails' button
		image_url = '';		// note: we do redirect on URLs for individual pages, including the first. 
	}
}
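
/*
A rough sketch of the Inkbunny rules as one pure function: landing pages of multi-image
submissions stay put, everything else swaps /screen/ for /full/. inkbunny_full_url is a
hypothetical name, and the file path in the usage note is invented for illustration.

```javascript
// Hypothetical helper: returns the full-size URL, or '' when the page should not redirect.
function inkbunny_full_url( html, current_url ) {
	var is_multi_image = html.indexOf( '<form id="changethumboriginal_form"' ) !== -1;
	var is_landing_page = current_url.indexOf( '&page=' ) === -1;
	if( is_multi_image && is_landing_page ) { return ''; }	// multi-image landing page: do nothing
	var image_index = html.indexOf( 'https://inkbunny.net/files/screen/' );
	if( image_index === -1 ) { return ''; }	// no screen-size image found
	var delimiter_index = html.indexOf( '"', image_index );
	if( delimiter_index === -1 ) { return ''; }	// malformed attribute
	return html.substring( image_index, delimiter_index ).replace( '/screen/', '/full/' );
}
```

Given markup containing src="https://inkbunny.net/files/screen/12/34_artist_pic.png", the
result is https://inkbunny.net/files/full/12/34_artist_pic.png.
*/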

// Pixiv isn't redirecting to full-size images.
// http://i1.pixiv.net/img-original/img/2014/10/16/02/08/34/46574774_p0.png
// http://i1.pixiv.net./img-original/img/2014/10/16/02/08/34/46574774_p0.jpg

// What about deleted works? e.g. http://www.pixiv.net/member_illust.php?mode=medium&illust_id=46575000

// http://i1.pixiv.net/img-original/img/2014/10/16/02/35/16/46575004_p0.jpg
// http://i1.pixiv.net/img-original/img/2014/10/16/02/35/16/46575004_p0.png
// Fuck! Wrong file extension doesn't 404 or 403. it returns a blank page. I could double-redirect... ugh. 
// Maybe just detect blankness on URLs ending in .jpg or .png and swap? still a double-redirect, but only rarely. 
// @include     http://*.pixiv.net/img-original/*.jp*
// http://www.pixiv.net/member_illust.php?mode=medium&illust_id=46757104&uarea=new_illust
// http://i1.pixiv.net/c/600x600/img-master/img/2014/10/27/08/16/45/46757104_p0_master1200.jpg
// http://i1.pixiv.net/c/600x600/img-master/img/2014/10/27/08/16/45/46757104_p0_master1200.png 
function scrape_pixiv() { 
	extract_image_url_after( '_illust_modal">', 'http://' ); 		// grab preview image from "medium" landing page 
		// convert URL to full-size. 
	image_url = image_url.replace( '_m.', '.' ); 		// old style: remove _m for full-size URL
	image_url = image_url.replace( '/c/600x600', '' ); 		// new style: remove /c/600x600, swap image-master for image-original, remove _master1200. 
	image_url = image_url.replace( '/img-master/', '/img-original/' ); 
	image_url = image_url.replace( '_master1200', '' ); 

	// Through sheer accident, the old manga code still works after Pixiv's latest change. 
	if( image_url == '') { 		// if there's no 'big' link and thus no image was grabbed, it's probably manga
		image_url = window.location.href.replace( 'mode=medium', 'mode=manga' );		// manga pages deserve their own HTML, so just go to that page 
		// Users: please consider Eza's Pixiv Fixiv, which replaces the default manga HTML with full images and none of that scroll-to-load nonsense. 
	} 
}
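
/*
A minimal sketch of the Pixiv URL rewriting above, plus the .jpg/.png swap that the notes
before scrape_pixiv only talk about. Both function names are hypothetical. The swap is an
assumption about how a fallback could look; nothing in this script performs it yet, and
detecting the blank page that Pixiv returns for a wrong extension is left out entirely.

```javascript
// Hypothetical helper: rewrite a "medium" preview URL into the guessed full-size URL.
function pixiv_preview_to_original( preview_url ) {
	return preview_url
		.replace( '_m.', '.' )	// old style: drop _m
		.replace( '/c/600x600', '' )	// new style: drop the resize prefix...
		.replace( '/img-master/', '/img-original/' )	// ...swap master for original...
		.replace( '_master1200', '' );	// ...and drop the size suffix
}

// Hypothetical fallback: try the other common extension when the first guess comes back blank.
function swap_jpg_png( url ) {
	if( url.slice( -4 ) === '.jpg' ) { return url.slice( 0, -4 ) + '.png'; }
	if( url.slice( -4 ) === '.png' ) { return url.slice( 0, -4 ) + '.jpg'; }
	return url;
}
```

The preview URL only ever carries the preview's extension, which is why the guessed original
can be wrong for PNG uploads with JPG previews.
*/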

function scrape_booru() {		// this works on a wide variety of booru-style imageboards. 
	extract_image_url_after( '>Resize image</a>', 'http://' );		// for booru's which have automatic resizing and images which require it
	if( image_url == '' ) {		// otherwise, use the image that's being displayed 
		var container = document.getElementById( 'image' ); 		// Instead of lurching through raw HTML, let's just grab the display image via the DOM. 
		if( container ) { image_url = container.src; } 		// no id="image" element means no redirect, instead of a TypeError. "You think it's cool that things don't always have to be a federal fucking issue." 
	} 
}










/*
Test suite of random URLs from the relevant sites: 
http://www.hentai-foundry.com/pictures/user/Bottlesoldier/133840/Akibabuse
http://www.hentai-foundry.com/pictures/user/Bottlesoldier/214533/Lil-Gwendolyn
https://inkbunny.net/submissionview.php?id=483550
https://inkbunny.net/submissionview.php?id=374519
http://rule34.xxx/index.php?page=post&s=view&id=1399731
http://rule34.xxx/index.php?page=post&s=view&id=1415193
http://equi.booru.org/index.php?page=post&s=view&id=56940
http://furry.booru.org/index.php?page=post&s=view&id=340299
http://derpibooru.org/470074?scope=scpe80a78d33e96a29ea172a0d93e6e90b47c6a431ea
http://mspabooru.com/index.php?page=post&s=view&id=131809
http://mspabooru.com/index.php?page=post&s=view&id=131804
http://shiniez.deviantart.com/art/thanx-for-5-m-alan-in-some-heavy-makeup-XD-413414430
http://danbooru.donmai.us/posts/1250724?tags=dennou_coil
http://danbooru.donmai.us/posts/1162284?tags=dennou_coil
data:text/html,<img src='http://example.com/image.jpg'>
http://www.furaffinity.net/view/12077223/
http://gamesbynick.tumblr.com/post/67039820534/the-secrets-out-guys-the-secret-is-out
http://honeyclop.tumblr.com/post/67122645946/stallion-foursome-commission-for-ciderbarrel-d
http://shubbabang.tumblr.com/post/20990300285/new-headcanon-karkat-is-ridiculously-good-at
http://www.furaffinity.net/view/12092394/
https://e621.net/post/show?md5=25385d2349ae11f2057874f0479422ad
http://sandralvv.tumblr.com/post/64933897836/how-did-varrick-get-that-film-cuz-i-want-a-copy
*/