Eza's Image Glutton のソースコード

Eza's Image Glutton

Redirects to high-res images on gallery sites, skipping past descriptions and comments
2014/12/16のページです。最新版はこちら
右端で折り返す
// ==UserScript==
// @name        Eza's Image Glutton
// @namespace   https://inkbunny.net/ezalias
// @author			Ezalias
// @description Redirects to high-res images on gallery sites, skipping past descriptions and comments
// @license     Public domain / No rights reserved
// @include     /^https*://www\.furaffinity\.net/(view|full)/.*/
// @include     https://inkbunny.net/submissionview.php*
// @include     http://gelbooru.com/*page=post&s=view*
// @include     http://youhate.us/*page=post&s=view*
// @include     http://www.gelbooru.com/*s=view*
// @include     http://danbooru.donmai.us/posts/*
// @include     https://danbooru.donmai.us/posts/*
// @include     /^http(s|)://e(621|926)\.net/post/show*//
// @include     http://*.deviantart.com/art/*
// @include     http://*.tumblr.com/*
// @include     http://*.hentai-foundry.com/pictures/*
// @include     /^https*://www\.sofurry\.com/view/*//
// @include     https://www.weasyl.com/*
// @include     http://www.y-gallery.net/view/*
// @include     http://rule34.paheal.net/post/view/*
// @include     http://rule34.xxx/index.php?page=post*
// @include     http://rule34hentai.net/post/view/*
// @include     /^https*://derpiboo.ru/*//
// @include     /^https*://derpibooru.org/.*/
// @include     http://*.booru.org/*s=view*
// @include     http://mspabooru.com/*s=view*
// @include     http://safebooru.org/*s=view*
// @include     http://www.majhost.com/cgi-bin/gallery.cgi?i=*
// @include     http://g.e-hentai.org/s/*
// @include     http://nijie.info/view.php?id=*
// @include     http://metabooru.com/*
// @include     http://www.pixiv.net/member_illust.php?mode=medium&illust_id=*
// @include     http://sleepymaid.com/gallery/displayimage.php?*
// @include     https://*.sankakucomplex.com/post/*
// @include     http://*.bronibooru.com/posts/*
// @exclude    http://www.deviantart.com/users/outgoing?*
// @exclude    *#dnr
// @version     1.22.2 
// ==/UserScript==



// Any single-image submission will redirect to the full-size image. On multi-image submissions, every page except the first will redirect to its full-size image. 
// If you go "back" to the normal gallery page (to favorite the image, read its description, leave a comment, etc.) then this script will not send you forward again. 
// https://greasyfork.org/scripts/4713-eza-s-image-glutton
// http://userscripts-mirror.org/scripts/show/169968 (waaay out of date)

// TO DO: 
// consider doing away with 'og:' stuff for scrape_tumblr, now that redirects don't fluff up the number of pages you go Back through. consistency and simplicity have their value. 
// for modify_tumblr: for photoset pages (but everywhere, to be safe) make unlinked images link to themselves. I want nice, clean, chronological tabs for multi-image comics. 
// ugh. test without adblock enabled. 
// modify_furaffinity to change prev/next/fav links with pre-appended #dnr. not raw html fiddling: use the DOM and getElementsByType or whatever. thingy.href=url_plus_dnr. 
// flickr? maybe separately. that whole site is a mess. also full-size images are sometimes gigantic, like dozens of megabytes. 
// inkbunny: move page links above first-page preview on multi-image submissions? - by altering divs or css, if possible. minimal molestation implies better future-proofing. 
	// weasyl: replace bespoke thumbnails with smallest preview images? eh, do these as separate scripts, once userscripts stops fucking around.
// Consider changing some @includes to @match. 
// What the hell is @grant? 
// Consider refactoring all Eza's scripts to create and destroy fewer variables. Garbage collection might be why these scripts are rough on CPU use. 
	// Is there some way to tell JS to ditch all variables? There are points in all my scripts where they're 100% finished and can mark all memory as disposable. 
	// Maybe change text-scrape functions to grab a large but fixed-size block of text. Or ditch html_dump entirely and use soft-scrape functions that treat document.etc as a const of absurd size. 
	// There's a 'delete' operator, but it only removes properties from objects. So maybe I can do var thing = new Object; thing.stuff = longstring; ... delete thing.stuff. 
	// I mention this because I'm not considering grabbing Pixiv pages to ensure I have full-size images with a single redirect. 
// http://thehentaiworld.com/hentai-doujinshi/theres-something-about-sakura-naruto/ ? I already do rule34; there's no pretending this is just about "art." 
	// Almost deserves a more Pixiv Fixiv-like fix. Maybe just a link dump like that DeviantArt gallery script?
// Swagster.com? Eh. The name alone rubs me the wrong way. Ugh, and they watermark. 
// Nijie.info support might be missing out on multipage submissions? 
// Metabooru is either blocking the US or went down completely three months ago. Leave it in for now, just in case. 
// add youhate.us to image glutton - it's literally gelbooru again - 
// might've backed the wrong horse with greasyfork. openuserjs seems much livelier. mirror to there, at least. I don't want to switch over after having declared that 'image glutton has a new home.' (incidentally, "image glutton" in quotes already returns the greasyfork page.) - https://openuserjs.org/user/add/scripts - 

// Since I'm just leafing through HTML (usually), can I jump to the image /before/ trying to load the page? GreaseMonkey has a wonky option for running the script before the page runs, but I don't think we get all the HTML first. Maybe... maybe AJAX the page we're on? Like, @RunAtStart or whatever, then create a little blank page, then grab the URL via XmlHTTPgetObject or whatever, then read the HTML as responseText. The trouble (I expect) would be going back to the normal page when someone hits 'back.' This script shouldn't run... but any browser will probably have cached the fake page. 

// Owyn Tyler has a ridiculously replete script with similar goals called Handy Just Image - http://userscripts.org/scripts/show/166494
// The supported-site list is waaay longer than mine, and/but his goals are more complex. Image Glutton exists only to deliver the image. 
// He's having trouble with back-trapping, though. His solution sounds absurdly complex even compared to mine. Test the script and recommend help if possible. 






// global variables, for simplicity
var image_url = '';		// location of the full-size image to redirect to
var wait_for_dnr = false;		// some site URLs use "#" liberally, so if this var isn't empty, only "#dnr" will stop a redirect



// detect site, extract image URL, then decide whether or not to redirect
	////////// 		Simple extract_image_url_after sites
if ( address_bar_contains('e621.net') ) { extract_image_url_after( '>Respond</a>', 'https://' ); }
else if ( address_bar_contains('e926.net') ) { extract_image_url_after( '<li>Size:', 'http' ); }
else if ( address_bar_contains('weasyl.com') ) { extract_image_url_after( '<div id="detail-art">', '/' ); }		// also redirects to plaintext/HTML on stories, haha
else if ( address_bar_contains('hentai-foundry.com') ) { extract_image_url_after( '<center><img', '//' ); }
else if ( address_bar_contains('y-gallery.net') ) { extract_image_url_after( 'a_center container2">', 'http://' ); }
else if ( address_bar_contains('rule34.xxx') ) { extract_image_url_after( '>Edit</a></li>', 'http://' ); }
else if ( address_bar_contains('derpiboo.ru') ) { extract_image_url_after( 'full res">View</a>', '//derpicdn' ); }
else if ( address_bar_contains('derpibooru.org') ) { extract_image_url_after( 'full res">View</a>', '//derpicdn' ); }
else if ( address_bar_contains('metabooru.com') ) { extract_image_url_after( 'og:image', 'http://' ); }
else if ( address_bar_contains('sankakucomplex.com' ) ) { extract_image_url_after( '<li>Original:', '//' ); } 
	////////// 		Slightly complicated extract_image_url_after sites
else if ( address_bar_contains('rule34hentai.net') ) { extract_image_url_after( 'shm-zoomer', '/_images/' ); wait_for_dnr = true; }
else if ( address_bar_contains('rule34.paheal.net') ) { extract_image_url_after( 'shm-zoomer', 'http://' ); wait_for_dnr = true; }
else if ( address_bar_contains('majhost.com') ) { image_url = document.getElementsByTagName( "img" )[0].src; }		// first and only <img> tag
	////////// 		Simple custom sites
else if ( address_bar_contains('sofurry.com') ) {
	image_url = window.location.href.replace('sofurry.com/view/','sofurryfiles.com/std/content?page='); 
	if( document.body.outerHTML.indexOf( '<div id="sfContentImage' ) < 0 ) { image_url = ''; } 		// Do not redirect from stories
	if( document.body.outerHTML.indexOf( '<div class="sf-story"' ) > 0 ) { image_url = ''; } } 		// Really do not redirect from stories
else if ( address_bar_contains('danbooru.donmai.us') ) { 
	extract_image_url_after( '% of original (', '/data/' );		// resized images will say "X% of original (view full" or something like that
	if( image_url == '' ) {extract_image_url_after( 'twitter:image:src', 'http://' );		// otherwise just grab the preview-sized image (this also works on pages claiming you need Gold to see them)
	image_url = image_url.replace( '/sample/sample-', '/' ); }	 }	// if the preview-sized image is a sample, fix that - this sometimes fails for PNG images with JPG previews
else if ( address_bar_contains('furaffinity.net') ) {
	if (unsafeWindow.full_url)			// Basically stolen from http://userscripts.org/scripts/review/157574 - but FA's kind enough to define the URL as a var, so why fight the obvious approach? 
		{ image_url = unsafeWindow.full_url; } }		// use full_url variable from live window HTML
else if ( address_bar_contains('http://g.e-hentai.org/s/') ) { 
	var image_index = document.body.outerHTML.indexOf( '</iframe>' );		// jump to end of navigation iframe
	image_index = document.body.outerHTML.indexOf( 'http://', image_index+1 );		// find next URL (link to next page)
	image_index = document.body.outerHTML.indexOf( 'http://', image_index+1 );		// find URL after that (image source)
	image_url = document.body.outerHTML.substring( image_index, document.body.outerHTML.indexOf( '"', image_index ) ); }		// grab image src, delimited by doublequote 
else if ( address_bar_contains('nijie.info') ) {
	extract_image_url_after( 'name="twitter:image"', 'http://' );		// some images are behind some sort of barrier, so let's grab the twitter-size image instead...
	image_url = image_url.replace( '/sp/', '/' ); }		// ... and drop the /sp/ to get the full-size URL. 
else if ( address_bar_contains('sleepymaid.com') ) { 
	extract_image_url_after( 'fullsize', 'albums/' ); 
	image_url = image_url.replace( 'normal_', '' ); 
	wait_for_dnr = true; 
}
	////////// 		Sites complex enough to shove into a function down below 
else if ( address_bar_contains( 'deviantart.com' ) ) { scrape_deviantart(); wait_for_dnr = true; }
else if ( address_bar_contains( 'inkbunny.net' ) ) { scrape_inkbunny(); }
else if ( address_bar_contains( 'tumblr.com' ) ) { scrape_tumblr(); }
else if ( address_bar_contains( 'pixiv.net' ) ) { scrape_pixiv(); }
else if ( address_bar_contains( 'gelbooru.com' ) ) { scrape_booru(); } 
else if ( address_bar_contains( 'youhate.us' ) ) { scrape_booru(); } 
else if ( address_bar_contains( '.booru.org' ) ) { scrape_booru(); } 
else if ( address_bar_contains( 'mspabooru.com' ) ) { scrape_booru(); } 
else if ( address_bar_contains( 'safebooru.org' ) ) { scrape_booru(); } 
else if ( address_bar_contains( 'bronibooru.com' ) ) { scrape_booru(); } 



// having defined image_url by scraping the page's HTML, modify the current URL to prevent back-traps, then redirect to that full image 
if( image_url !== '' && (!address_bar_contains('#') || wait_for_dnr) ) 		// do nothing if image_url is empty. ignore pages with a "#", unless wait_for_dnr makes you wait for a full "#dnr". 
{
		// some images don't redirect properly, even if you manually "view image" - so we append ".jpg" to URLs without file extensions, forcing the browser to consider them images
		// even if this doesn't work, the new URL should just 404, which is better than the semi-modal "octet stream" dialog seen otherwise. 
	if( image_url.lastIndexOf( '/' ) > image_url.lastIndexOf( '.' ) ) { image_url = image_url + '.jpg'; }		// if there's not a "." after the last "/" then slap a file extension on there 
	if( image_url[ image_url.length - 1 ] == '.' ) { image_url = image_url + 'jpg'; }		// if the URL ends with a dot, slap a file extension on there 

		// modify current location, so that when the user clicks "back," they aren't immediately sent forward again
	modified_url = window.location.href + '#dnr'; 		// add do-not-redirect tag to current URL
	history.replaceState( {foo:'bar'}, 'Do-not-redirect version', modified_url);		// modify URL without redirecting. the {foo:'bar'} thing is a state object that I don't care about, but the function needs one.

//	window.location.href = image_url;		// redirect to full image
	location.assign("javascript:window.location.href=\""+image_url+"\";");		// pixiv-friendly redirect to full image: maintains referral, happens within document's scope instead of within greasemonkey's
}		// end of main execution





// ----- //			Functions for readability





function extract_image_url_after( string_before_url, url_begins_with ) {		// extract the first quote-delimited string that appears after unique first var and begins with second var
	var html_elements = document.getElementsByTagName('html'); 		// this way we avoiding doing getElementsEtc every time, and we still access the whole page's HTML by reference
	var string_index = html_elements[0].innerHTML.indexOf( string_before_url ); 		// find a unique string somewhere before the image URL
	if( string_index > -1 ) {
		var image_index = html_elements[0].innerHTML.indexOf( url_begins_with, string_index );  		// find where the image URL starts after the unique string
		var delimiter_index = html_elements[0].innerHTML.indexOf( '"', image_index ); 		// find first doublequote after the image URL starts
		image_url = html_elements[0].innerHTML.substring( image_index, delimiter_index ); 		// grab the image URL up to the next doublequote 
	}
}

function address_bar_contains( string_to_look_for ) {	// I'm so tired of typing out window.location.etc == -1. It's stupidly verbose and it looks terrible.
	return (window.location.href.indexOf( string_to_look_for ) !== -1);		// this makes code more concise and readable. if( address_bar_contains( 'tld.com' ) ) { do tld.com stuff; }
}





// ----- //			Functions for individual websites (separated for being especially long)





function scrape_tumblr() {
	// Tumblr's goals are basically like Inkbunny's:
	//		- On a generic post or /image/ page with a single image: redirect to that image in the highest resolution available
	//		- On a multi-image post: do nothing, since photosets already are or link to their highest-resolution versions
	if( address_bar_contains('/image/') ) { 		// on centered-image pages
		extract_image_url_after( 'id="content-image"', 'http://' );		// get conveniently-labeled id="image" image
	}		// The above will handle /post/ to /image/ URL conversions in a sort of double-redirect, letting Tumblr do the hard work of finding the highest-res version of an image
	else if( address_bar_contains( '/post/' ) ) {		// on generic Tumblr posts
		image_url = window.location.href.replace( '/post/', '/image/' );  
		var comment_check_index = image_url.lastIndexOf( '/' );		// if the /post/ contained text, it gets appended after the post/image number and screws up the URL...
		if( image_url.substring( comment_check_index - 6, comment_check_index ) !== '/image' ) {		// ... so if the last '/' isn't the latter in '/image/' then we dump everything after that '/'...
			image_url = image_url.substring( 0, comment_check_index );		// ... by taking the substring up to the index of that final '/'.
		}
			// If the theme is kind enough to use proper Open Graph tags, let's use those instead for a single redirect:
		var post_image_url = image_url;		// store the double-redirect URL just in case
		extract_image_url_after( 'property="og:image"', 'http' );
		if( image_url.indexOf( '_1280.' ) == -1 ) { image_url = post_image_url; }		// if the Open Graph isn't _1280 (and thus might be misdefined at low-res), just double-redirect instead  
	}		// this might also trigger on images with no _size in the URL, but I think those are all tumblr-feed:entry items anyway. hardly matters. /image/ would still work. 

		// Now that image_url is defined, we can blank it out if we don't want to redirect. Much easier than piling on if( || && || )-style logic. 
	if( address_bar_contains( '_iframe/' ) ) { image_url = ''; }		// Do not redirect from photoset iframe pages, since they trigger their own instance of this script
	if( document.body.outerHTML.indexOf( 'class="html_photoset"' ) !== -1 ) { image_url = ''; }		// Do not redirect from photosets (because photoset images always are or link to highest-res versions)
	if( document.head.outerHTML.indexOf( 'content="tumblr-feed:entry"' ) !== -1 ) { image_url = ''; }	// Do not redirect if Open Graph indicates a text-only post (as opposed to tumblr-feed:photo). 
	if( document.head.outerHTML.indexOf( 'content="tumblr-feed:photoset""' ) !== -1 ) { image_url = ''; }	// Do not redirect if Open Graph indicates a text-only post (as opposed to tumblr-feed:photo). 
}

// DeviantArt's HTML is an inconsistent mess. 
// maybe use the tumblr share thing? except I'd have to unescape a bunch of $2F%E6 nonsense. 
// dev-page-download? no. that's just more custom-view horseshit. 
// collect_rid? seems to be based on URL, e.g. collect_rid="1:484295327". 
function scrape_deviantart() {		// this doesn't use ditch_html_before because data-super-full-img's appear for random links - we need to avoid grabbing one from the ass-end of small-image pages
	var image_index = document.body.outerHTML.indexOf( 'class="dev-view-deviation"' ); 		// jump to unique and hopefully universal dev-view-deviation div
	if( image_index > 0 ) { 		// Don't redirect on pages without a deviation (e.g. "oops not found" faux-404s). 
		var image_index = document.body.outerHTML.indexOf( 'src=', image_index+1 ); 		// jump to first src (preview size)
		var image_index = document.body.outerHTML.indexOf( 'http://', image_index+1 ); 		// jump to the URL defined in src
		image_url = document.body.outerHTML.substring( image_index, document.body.outerHTML.indexOf( '"', image_index ) ); 		// grab URL, delimited by doublequote
		if( image_url.indexOf( "/PRE/" ) > 0 ) { image_url = image_url.replace( "/PRE/", "/" ); } 		// fix preview-size image to be full-size
	} 
	if( document.body.outerHTML.indexOf( '<div id="flashed-in"' ) > 0 ) { image_url = ''; } 		// Do not redirect on flash pages 
}

function scrape_inkbunny() {
	var image_index = document.body.outerHTML.indexOf( 'https://inkbunny.net/files/screen/' );		// look for screen-size image URL 
	if( image_index !== -1 )		// if that URL is found
	{
		var delimiter_index = document.body.outerHTML.indexOf( '"', image_index );		// find first doublequote delimiter after URL
		image_url = document.body.outerHTML.substring( image_index, delimiter_index );		// grab delimited URL 
		image_url = image_url.replace( '/screen/', '/full/' );		// turn screen URL into full URL - we don't care if /screen/ is already full-size, because /full/ will kindly redirect anyway
	}

	// if this page is the landing page for a multi-image submission, do not redirect 
	if ( document.body.outerHTML.indexOf( '<form id="changethumboriginal_form"' ) !== -1 && !address_bar_contains( '&page=' ) ) {		// look for language-agnostic 'show custom thumbnails' button
		image_url = '';		// note: we do redirect on URLs for individual pages, including the first. 
	}
}

function scrape_pixiv() { 
	extract_image_url_after( 'class="_illust_modal', 'http://' ); 		// Oh, what now? Code below doesn't work for some pages, so do this instead. (This goes first because only the last successful 'extraction' matters.) 
	extract_image_url_after( 'class="big"', 'http://' ); 		// New Pixiv pages (Dec '14) provide the big URL rather directly. 

	// Through sheer accident, the old manga code still works after Pixiv's latest change. 
	if( image_url == '') { 		// if there's no 'big' link and thus no image was grabbed, it's probably manga
		image_url = window.location.href.replace( 'mode=medium', 'mode=manga' );		// manga pages deserve their own HTML, so just go to that page 
		// Users: please consider Eza's Pixiv Fixiv, which replaces the default manga HTML with full images and none of that scroll-to-load nonsense. 
	} 

	// Don't redirect to "Ugoira" animations (ZIP full of JPGs, played as HTML slideshow) 
	if( document.getElementsByTagName('html')[0].innerHTML.indexOf( 'class="_ugoku' ) > 0 ) { image_url = ''; } 		// A little messy since we ditched html_copy, isn't it? 
}

function scrape_booru() {		// this works on a wide variety of booru-style imageboards. 
	extract_image_url_after( '>Resize image</a>', 'http://' );		// for booru's which have automatic resizing and images which require it
	if( image_url == '' ) {		// otherwise, use the image that's being displayed 
		var container = document.getElementById( 'image' ); 		// Instead of lurching through raw HTML, let's just grab the display image via the DOM. 
		image_url = container.src; 		// "You think it's cool that things don't always have to be a federal fucking issue." 
	} 
}










/*
Test suite of random URLs from the relevant sites: 
http://www.hentai-foundry.com/pictures/user/Bottlesoldier/133840/Akibabuse
http://www.hentai-foundry.com/pictures/user/Bottlesoldier/214533/Lil-Gwendolyn
https://inkbunny.net/submissionview.php?id=483550
https://inkbunny.net/submissionview.php?id=374519
http://rule34.xxx/index.php?page=post&s=view&id=1399731
http://rule34.xxx/index.php?page=post&s=view&id=1415193
http://equi.booru.org/index.php?page=post&s=view&id=56940
http://furry.booru.org/index.php?page=post&s=view&id=340299
http://derpibooru.org/470074?scope=scpe80a78d33e96a29ea172a0d93e6e90b47c6a431ea
http://mspabooru.com/index.php?page=post&s=view&id=131809
http://mspabooru.com/index.php?page=post&s=view&id=131804
http://shiniez.deviantart.com/art/thanx-for-5-m-alan-in-some-heavy-makeup-XD-413414430
http://danbooru.donmai.us/posts/1250724?tags=dennou_coil
http://danbooru.donmai.us/posts/1162284?tags=dennou_coildata:text/html,<img src='http://example.com/image.jpg'>
http://www.furaffinity.net/view/12077223/
http://gamesbynick.tumblr.com/post/67039820534/the-secrets-out-guys-the-secret-is-out
http://honeyclop.tumblr.com/post/67122645946/stallion-foursome-commission-for-ciderbarrel-d
http://shubbabang.tumblr.com/post/20990300285/new-headcanon-karkat-is-ridiculously-good-at
http://www.furaffinity.net/view/12092394/
https://e621.net/post/show?md5=25385d2349ae11f2057874f0479422ad
http://sandralvv.tumblr.com/post/64933897836/how-did-varrick-get-that-film-cuz-i-want-a-copy
*/