A humble attempt at spinning the web. For the love of Python and JavaScript.
Site served by Flask (Python) with help from JavaScript and Bootstrap.
Please leave a message in the footer!


Enjoy,
Peer Rothchild

See the Progressive Web Application demo below.





Progressive Web Application



A pul·chri·tu·di·nous /ˌpəlkrəˈt(y)o͞od(ə)nəs/ installable, native-like database app with offline capability.





To see the mobile version, type these three lines into the browser console:
$('#pr-video').hide();
$('#pr-video-mobile').removeClass('hide');
$('#pr-video-mobile').addClass('show');




HTML / CSS / JS (+jQuery) / PHP / MySQL [IndexedDB]


service-worker.js
manifest.json
IndexedDB


Database

IndexedDB
PHP
MySQLi


RESTful Interactive
JavaScript Interactive
Lots of REACT!!!





Web Designs



Built with Flask, HTML, CSS, JavaScript.
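
For a sense of how pages like these get served, here is a minimal Flask sketch; the route and template names are illustrative, not this site's actual code:

# Minimal Flask sketch of the kind of setup that serves these pages.
# Route and template names are illustrative, not this site's actual code.
from flask import Flask, render_template

app = Flask(__name__)

@app.route('/')
def index():
    # Render templates/index.html, which pulls in the site's CSS/JS
    return render_template('index.html')

if __name__ == '__main__':
    app.run()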




Hosted on an Ubuntu Server

RESTful

jQuery + Bootstrap

AngularJS














Watch Folders

The following is a pair of scripts, in Python and XML respectively, that monitor a folder and, whenever an image file is added, copy the image to a different directory.


This is the actual Python script.
A few things need to happen while you write the script and after:

  • Replace the path with your own path to the monitored folder, and pick a destination.
  • You must reference the path to this file in your Launch Agent script, so remember where it is.
  • There must be a 'shebang' as the first line. This tells the system which interpreter will run the script.
  • Lastly, we must change permissions to make this program executable. This is done in the Terminal with the command chmod +x /path/to/script.py


Figure 1. The Python script that will be called by the launch agent.
#!/usr/bin/env python
import os
import shutil

# Folder to monitor and destination for the images
path = "/Users/rachelmarie/Desktop/python/hoo/"
move_to = "/Users/rachelmarie/Desktop/noo/"

files = sorted(os.listdir(path))
already_there = os.listdir(move_to)

for f in files:
    # Only copy .jpg images that are not already at the destination
    if f.lower().endswith('.jpg') and f not in already_there:
        shutil.copy(os.path.join(path, f), os.path.join(move_to, f))


The cool thing about this project is that we'll be creating what's called a 'Launch Agent' that gets loaded when the user logs in. That's right, we won't have a Python program constantly running; instead we'll use macOS's built-in 'launchd' to launch our short Python script once a specific event is triggered. It'll do the job, and close.

  • If you should need to debug this script, there is a built-in tool in macOS, plutil, which proved very useful in troubleshooting my first property list.
    It can be accessed via the Terminal with the command plutil -lint /path/to/propertylist.plist
  • The title of this file should follow convention by using the format com.name.unique.plist, where 'name.unique' is picked by the user but keeps this format.
  • The file must be placed in the correct folder. In this case, it will be located in ~/Library/LaunchAgents, where '~' points to the current user's home directory. It will be loaded at the next login, or you can load it immediately from the Terminal with launchctl load ~/Library/LaunchAgents/com.name.unique.plist


Figure 2. The launch agent, or 'property list', which is placed in the directory ~/Library/LaunchAgents and loaded upon login.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	
	<key>Label</key>
	<string>com.peter.launchdtest</string>
	
	
	<key>ProgramArguments</key>
		<array>
			<string>/Users/rachelmarie/Desktop/python/startt.py</string>
		</array>
	
	
	<key>EnvironmentVariables</key>
	<dict>
		<key>PATH</key>
		<string>/Library/Frameworks/Python.framework/Versions/3.7/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
	</dict>
	
	
	<key>StandardOutPath</key>
	<string>/tmp/test.stdout</string>
	
	
	<key>StandardErrorPath</key>
	<string>/tmp/test.stderr</string>
	
	
	<key>WatchPaths</key>
	<array>
		<string>/Users/rachelmarie/Desktop/python/hoo</string>
	</array>
	
	
	<key>Debug</key>
	<true/>
		

</dict>
		
</plist>


When anything is added to the folder /Users/rachelmarie/Desktop/python/hoo/ or any of its contents are modified, launchd runs the Python script, which copies any new .jpg files to the destination folder.
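
A quick way to sanity-check the whole setup, assuming the paths from the figures above, is to drop a test image into the watched folder and see whether it appears at the destination:

# Sanity check for the watch-folder setup; paths assumed from the figures above.
import os
import time
import shutil

watched = "/Users/rachelmarie/Desktop/python/hoo/"
dest = "/Users/rachelmarie/Desktop/noo/"

# Copying a .jpg into the watched folder should trigger launchd.
# 'test.jpg' here is a hypothetical image of your choosing.
shutil.copy("test.jpg", watched)

time.sleep(5)  # give launchd a moment to run the script
if "test.jpg" in os.listdir(dest):
    print("It works!")
else:
    print("Not yet -- check /tmp/test.stderr for clues.")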



Amazon Web Scraper

AmazonScrape

Python. 2018

A tool that scrapes Amazon product pages through proxies and commits the findings to a MySQL database.

Dependencies: lxml, requests, MySQLdb

Inputs: an ASIN list, a database name, and server login credentials.

From amazon.com: Amazon Standard Identification Numbers (ASINs) are unique blocks of 10 letters and/or numbers that identify items. You can find the ASIN on the item's product information page at Amazon.com. For books, the ASIN is the same as the ISBN number, but for all other products a new ASIN is created when the item is uploaded to our catalogue. You will find an item's ASIN on the product detail page alongside further details relating to the item, which may include information such as size, number of pages (if it's a book) or number of discs (if it's a CD).
ASINs can be used to search for items in our catalogue. If you know the ASIN or ISBN of the item you are looking for, simply type it into the search box (which can be found near the top of most pages), hit the "Go" button and, if the item is listed in our catalogue, it will appear in your search results.
For example, the ASIN for Hasbro's "Monopoly" game is B00005N5PF.
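
Given that description, a minimal sketch of checking that an input string at least looks like an ASIN (ten letters and/or digits) might be:

# Minimal sketch: check that a string looks like an ASIN per the description above.
import re

def looks_like_asin(s):
    # Ten uppercase letters and/or digits
    return bool(re.match(r'^[A-Z0-9]{10}$', s))

print(looks_like_asin('B00005N5PF'))  # True -- the Monopoly example above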


Available on GitHub.



A DataBaseObject object is created with three arguments:

  • Your MySQL login name
  • Your MySQL password
  • The name of the database where table storing product details is to be created

Then, a DataGrabber object is created with the ASIN .csv file or list.

  • As is, you need to have already created your database and table (a minimal sketch follows this list).
  • As is, you need to provide a list of ASINs. Development of an ASIN generator based on keywords is in progress.
  • As is, you also need to provide a list of proxies. The ones I used were obtained quickly with a simple Google search for 'free proxies'.
  • This will be implemented with a front end on a webpage in the near future.
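
Since the script expects the database and table to exist already, here is a minimal sketch of creating them with MySQLdb; the table and column names match the INSERT statement in Figure 1, and the credentials are placeholders:

# Minimal sketch: create the database and table the scraper expects.
# Names match the INSERT in Figure 1; credentials are placeholders.
import MySQLdb

conn = MySQLdb.connect("localhost", "root", "your_password")
c = conn.cursor()
c.execute("CREATE DATABASE IF NOT EXISTS amazon")
c.execute("USE amazon")
c.execute("""CREATE TABLE IF NOT EXISTS scrape (
	CATEGORY      TEXT,
	ORIGINALPRICE VARCHAR(32),
	NAME          TEXT,
	URL           TEXT,
	SALEPRICE     VARCHAR(32),
	AVAILABILITY  TEXT)""")
conn.commit()
conn.close()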
Figure 1. The Python script, which is launched from your server.
"""
Tool to scrape Amazon for product details and insert them
into a database via MySQL.
-Enjoy!!!

PeerRoth, 12/2018, Help: scrapehero.com, pythonprogramming.net

Dependencies:
    lxml
    requests
    MySQLdb

Futures:
    Database and table creation
    Version compatibility
    Develop an ASIN generator
"""

import csv
from itertools import cycle
from time import sleep

import MySQLdb
import requests
from lxml import html


class DataBaseObject(object):
	""" Create a Dictionary with MySQL login and database name """
	def __init__(self, mysql_id, mysql_password, database_name):
		self.login = mysql_id
		self.password = mysql_password
		self.dbname = database_name
	
	def getCreds(self):	
		return {'li':self.login, 'pw':self.password, 'db':self.dbname}

class DataBaseInsert(object):
	""" Insert product data into SQL database on server. """
	def __init__(self, log):
		""" Grab login, password and database name from DBO """
		self.dbname = log.get('db', None)
		self.login = log.get('li', None)
		self.password = log.get('pw', None)
	
	def insertSQL(self, details):
		""" Grab desired values from DataGrabber Dictionary """
		conn = MySQLdb.connect("localhost", self.login, self.password, self.dbname)
		c = conn.cursor()
		d = details.get("CATEGORY", "none")
		e = details.get("ORIGINAL_PRICE", "none")
		# Prices can come back as None when the xpath found nothing
		if not isinstance(e, str):
			e = 'N/A'
		f = details.get("NAME", "none")
		g = details.get("URL", "none")
		h = details.get("SALE_PRICE", "none")
		if not isinstance(h, str):
			h = 'N/A'
		i = details.get("AVAILABILITY", "none")
		if not isinstance(i, str):
			i = 'N/A'
		# Strip newlines and long white spaces which frequently occur if no availability
		i = ' '.join(i.split())
		# These columns were already created by the user
		c.execute("INSERT INTO scrape (CATEGORY,ORIGINALPRICE,NAME,URL,SALEPRICE,AVAILABILITY) VALUES (%s,%s,%s,%s,%s,%s)",
				(d, e, f, g, h, i))
		conn.commit()
		conn.close()

class DataGrabber(object):
	""" Grab desired attributes from Amazon product """
	def __init__(self, asin_file=None, proxy_list=None, log=None):
		""" asin_file must be a .csv file with an 'asin' header column and one ASIN per line """
		self.asin_list = []
		if asin_file:
			with open(asin_file) as asin_file_open:
				read_file = csv.DictReader(asin_file_open)
				for line in read_file:
					self.asin_list.append(line['asin'])
		else:
			# This is a backup list with random products
			self.asin_list = [
				'B00O9A48N2',
				'B0046UR4F4',
				'B00JGTVU5A',
				'B00GJYCIVK',
				'B00EPGK7CQ',
				'B00EPGKA4G',
				'B00YW5DLB4',
				'B00KGD0628']

		if log:
			self.log = log
		 
	def AmazonGet(self, url, proxies=None):
		""" Assign a proxy, scrape the page, and insert the result into the database """
		if proxies:
			self.proxies = proxies
		else:
			# Backup list of proxies that worked on 12.2.18
			self.proxies = ['54.236.44.224:3128','205.177.86.213:8888','35.240.29.142:3128','173.192.21.89:8123']
		headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36'}
		proxy_pool = cycle(self.proxies)

		# Try up to five proxies before giving up on this URL
		for _ in range(5):
			proxy = next(proxy_pool)
			try:
				page = requests.get(url, proxies={"https": proxy}, headers=headers)
				sleep(3)
				if page.status_code != 200:
					raise ValueError('You\'re caught. Get new proxies, check if headers updated.')

				doc = html.fromstring(page.content)
				XPATH_NAME = '//h1[@id="title"]//text()'
				XPATH_SALE_PRICE = '//span[contains(@id,"ourprice") or contains(@id,"saleprice")]/text()'
				XPATH_ORIGINAL_PRICE = '//td[contains(text(),"List Price") or contains(text(),"M.R.P") or contains(text(),"Price")]/following-sibling::td/text()'
				XPATH_CATEGORY = '//a[@class="a-link-normal a-color-tertiary"]//text()'
				XPATH_AVAILABILITY = '//div[@id="availability"]//text()'

				RAW_NAME = doc.xpath(XPATH_NAME)
				RAW_SALE_PRICE = doc.xpath(XPATH_SALE_PRICE)
				RAW_CATEGORY = doc.xpath(XPATH_CATEGORY)
				RAW_ORIGINAL_PRICE = doc.xpath(XPATH_ORIGINAL_PRICE)
				RAW_AVAILABILITY = doc.xpath(XPATH_AVAILABILITY)

				NAME = ' '.join(''.join(RAW_NAME).split()) if RAW_NAME else None
				SALE_PRICE = ' '.join(''.join(RAW_SALE_PRICE).split()).strip() if RAW_SALE_PRICE else None
				CATEGORY = ' > '.join([part.strip() for part in RAW_CATEGORY]) if RAW_CATEGORY else None
				ORIGINAL_PRICE = ''.join(RAW_ORIGINAL_PRICE).strip() if RAW_ORIGINAL_PRICE else None
				AVAILABILITY = ''.join(RAW_AVAILABILITY).strip() if RAW_AVAILABILITY else None

				if not ORIGINAL_PRICE:
					ORIGINAL_PRICE = SALE_PRICE

				data = {
					'NAME': NAME,
					'SALE_PRICE': SALE_PRICE,
					'CATEGORY': CATEGORY,
					'ORIGINAL_PRICE': ORIGINAL_PRICE,
					'AVAILABILITY': AVAILABILITY,
					'URL': url}

				dbinsert = DataBaseInsert(self.log)
				dbinsert.insertSQL(data)
				return data

			except Exception as e:
				# Free proxies often raise connection errors; move on to the next one
				print(e, "Skipping proxy. Connection error")

	def readAsin(self):
		""" Send each element from the list of ASINs to AmazonGet(). """
		for asin in self.asin_list:
			url = "http://www.amazon.com/dp/" + asin
			self.AmazonGet(url)
			sleep(5)
		
 
if __name__ == "__main__":
	ASIN_Feeder = 	'asinlist.csv'  # FILL IN FILE CONTAINING ASINS
	Database_Name = 'amazon'		# FILL IN NAME OF DATABASE
	Table_Name = 	'scrape'		# FILL IN NAME OF TABLE (currently hardcoded in insertSQL)
	Login_ID = 		'root'			# FILL IN YOUR MYSQL USERNAME
	Login_PW = 		'*************' # FILL IN YOUR MYSQL PASSWORD

	one = DataBaseObject(Login_ID, Login_PW, Database_Name)
	two = DataGrabber(asin_file=ASIN_Feeder, log=one.getCreds())
	two.readAsin()





Peter Rodinis
Development Portfolio
© New Waves Solutions 03.21.19



