Round and Round – Using Generators in Python

Overview

Most all modern programming languages support constructs for storing lists of data – think C++ arrays, Java ArrayLists, and Python lists. Any time we have a list of data, it’s often necessary to use loops to perform some operation on each item in that list. Again, most modern languages support ways of looping over these lists of data, most commonly “for” loops.

In Python, it’s trivial to iterate over a list using a for loop. However, Python lists are stored in memory. What happens if the data set you need to operate on is large and therefore memory inefficient? Python offers constructs called iterators that allow one to loop over data sets that are not stored in memory. Some iterators are built in for things like file reading, but you can easily create your own custom iterators using a concept called generators!

Iterators? Generators? Combobulators?

Traditional List Iteration

It’s easy to create a list of numbers in Python an loop over that list. Let’s say for example, we want to create a list of multiples of 2. We’ll store 5 numbers in this list. For each number in this list, we’ll just print it out to the screen for now:

def print_multiples():

    multiples = get_multiples()
    for m in multiples:
        print m

def get_multiples():

    return [ 2, 4, 6, 8, 10 ]

if __name__ == "__main__":

    print_multiples()

Let’s step through this code a bit and explain how it works. When we enter the print_multiples function from main, the first call multiples = get_multiples() assigns the return value of get_multiples() to the multiples variable.

The multiples variable now stores an in-memory list with all of our multiple of 2 values – 2, 4, 6, 8, and 10. We next go to the for loop for m in multiples:. These Python for loops are slick and fairly straightforward – for each go around the loop, the next value in the list is stored directly in the variable m. The loops continues until each value in the list is exhausted.

The output looks like this:

python test.py
2
4
6
8
10

So what’s a generator look like?

Now let’s try this code again, using Python’s generator construct. Here’s what that looks like:

def print_multiples():

    multiples = get_multiples()
    for m in multiples:
        print m

def get_multiples():

    for i in range(1, 6):
        yield i * 2

if __name__ == "__main__":
    print_multiples()

Notice our output is the same:

python test.py
2
4
6
8
10

This code requires some closer examination to understand how it works. When we assign multiples = get_multiples(), we don’t a assign a list, we assign what’s known in Python as an iterator. An iterator object exposes a next method that allows for loops to retrieve each sequential item in a list or other iterable object.

When we enter our for loop this time, we don’t iterate over the list – instead our iterator uses the get_multiples() generator function to retrieve each item one by one. The first time we go around the for loop in print_multiples, we enter the get_multiples function and enter its for loop.

Now, you’ll notice a different keyword being using in get_multiples – instead of returning an entire list, the function only yields one item, the result of i * 2. The value is passed through the iterator’s next function and printed to the screen. The next time around the for loop in print_multiples, the code goes to the same spot the value was yielded from in get_multiples. The return keyword returns code control to the caller, whereas the yield keyword only temporarily yields control back to the caller until the next time the caller needs a value from the iterator. The get_multiples function’s for loop continues, and yields the next multiple of 2. The function will continue yielding values of 2 until it’s for loop ends, yielding 10.

Cool, so why would I want to use generators instead of a list?

Our trivial example makes the use of generators pretty clear, but it doesn’t explain why they’re actually useful. Why all the extra complexity to avoid storing a whole 5 numbers in memory?. The use case of generators goes far beyond small lists of numbers.

First, what if we wanted to display the first one billion multiples of 2? In that case, it becomes much more expensive to store one billion integers in memory – it would eat up over a gigagbyte!.

In real-world software engineering applications, we often use generators to deal with large datasets even beyond simple calculations such as this. Generators can be used to operate on large database queries or data read from files on disk, where it would be inefficient or even impossible to read the information into memory.

Generate your own generators

In this article, we’ve explained how to go beyond memory-stored lists of information and create our own generators. Instead of hogging memory for large data operations, we can make our own memory-efficient iterators. Now when you have large calculations, database queries, or file reads to worry about, you can keep your memory usage low and your code easy to understand thanks to Python!

Configuring SSL with Apache and Let’s Encrypt (Part 2 – Manual Apache Configuration)

Overview

In the first part of “Configuring SSL with Apache and Let’s Encrypt”, we discussed why you should be offering HTTPS connections to your site, and how to get a free certificate from Let’s Encrypt. That tutorial shows commands that use CertBot to automatically configure your Apache web server with the new certificate.

However, if you’re like me you may want some more fine-tuned control over how your web server is configured. We can use CertBot to just retrieve a certificate for us and set up Apache virtual hosts by hand. You can even force connections to your site to use HTTPS all the time.

Configuring HTTPS with Apache By Hand

Finding your Let’s Encrypt certificates

In the case you want to hand-configure your server, you can ask CertBot to just fetch the certificates and leave Apache alone with:

sudo certbot --apache certonly

This will create a folder containing the files you’ll need for that domain. You’ll need these two files:

/etc/letsencrypt/live/mydomain/fullchain.pem
/etc/letsencrypt/live/mydomain/privkey.pem

fullchain.pem is the certificate itself, and privkey.pem is the certificate’s private key. Make sure you keep these safe, especially the private key!

You can then copy these two files to their respective directories in /etc/ssl. If you want, you can change the file names. I like to name mine with the site and use .crt/.key for the file extensions:

sudo cp fullchain.pem /etc/ssl/certs/mydomain.crt
sudo cp privkey.pem /etc/ssl/private/mydomain.key

Setting up Apache Virtual Hosts with SSL

The next step is to set up Apache virtual hosts to use our certificate. Virtual hosts are a great tool that allow you to manage multiple websites (with multiple domains) on one web server. They also allow you to configure things like which SSL certificates to use and which folders to serve for your site, so it is valuable to learn how they work.

Your Apache SSL virtual hosts files will most likely be in the /etc/apache2/sites-available directory. For my version of Apache, the SSL vhosts file is called default-ssl.conf.

Let’s take a look at what a basic SSL-configured virtual host definition looks like:


<IfModule mod_ssl.c>

 <VirtualHost *:443>

  ServerAdmin you@yoursite.net

  DocumentRoot /var/www/html/yoursite

  ServerName yoursite.net
  ServerAlias www.yoursite.net

  SSLEngine on
  SSLCertificateFile /etc/ssl/certs/yoursite.crt
  SSLCertificateKeyFile /etc/ssl/private/yoursite.key

  <FilesMatch "\.(cgi|shtml|phtml|php)$">
    SSLOptions +StdEnvVars
  </FilesMatch>
  <Directory /usr/lib/cgi-bin>
    SSLOptions +StdEnvVars
  </Directory>

  BrowserMatch "MSIE [2-6]" \
    nokeepalive ssl-unclean-shutdown \
    downgrade-1.0 force-response-1.0
  BrowserMatch "MSIE [17-9]" ssl-unclean-shutdown

 </VirtualHost>

</IfModule>

Let’s break down what this all means, especially the SSL part.

  ServerAdmin you@yoursite.net

  DocumentRoot /var/www/html/yoursite

  ServerName yoursite.net
  ServerAlias www.yoursite.net

This first block of information is the same as a virtual host for plain HTTP. This information tells apache what to do if someone comes to your web server from the domain name yoursite.net. If a client makes a request, Apache will serve files starting at the document root /var/www/html/yoursite. All of the files your site will serve should be contained in that folder. The ServerAdmin bit tells someone who they can contact if something is wrong. According to this Stack Overflow question though, that feature is deprecated so you may want to omit it.


  SSLEngine on
  SSLCertificateFile /etc/ssl/certs/yoursite.crt
  SSLCertificateKeyFile /etc/ssl/private/yoursite.key

  <FilesMatch "\.(cgi|shtml|phtml|php)$">
    SSLOptions +StdEnvVars
  </FilesMatch>
  <Directory /usr/lib/cgi-bin>
    SSLOptions +StdEnvVars
  </Directory>

  BrowserMatch "MSIE [2-6]" \
    nokeepalive ssl-unclean-shutdown \
    downgrade-1.0 force-response-1.0
  BrowserMatch "MSIE [17-9]" ssl-unclean-shutdown

This second block of code starts the configuration of SSL. It tells Apache that connections to this site via HTTPS can be accepted, and includes some information about what to do with certain browsers and CGI requests. You can safely leave the bottom portion alone; it is included by default and I’ve not found a reason in my experience to touch it.

The more important piece you’ll need to edit is the section containing SSLCertificateFile and SSLCertificateKeyFile. Make sure that those paths match up to the files you copied to /etc/ssl/certs and /etc/ssl/private. Those directives tell Apache what certificate file to use for incoming SSL connections, and the private key that will be used to decrypt data sent by the client.

Once you’ve got that file correctly edited, you can restart Apache using the command sudo service apache2 restart. If you get an error here, there is most likely a syntax problem in the default-ssl.conf file that you just edited.

Forcing HTTPS connections

Now that you’ve got a shiny, new, CA-signed SSL certificate configured for Apache, you may want to do away with HTTP connections altogether. Fortunately, Apache makes this fairly simple, again with the use of virtual hosts.

This time, you’ll be editing the HTTP vhosts file. In /etc/apache2/sites-available, my version of Apache has a file called 000-default.conf. Open this up in your favorite text editor, and add the following to each virtual host you want to force HTTPS with:

Redirect / https://mysite.net

This directive tells apache to redirect any plain text connection to the document root of that site to redirect to the document root of your site with HTTPS. One thing I’ve noticed is that this can cause issues with HTTP links to subdirectories like http://mysite.net/subdir, so you’ll want to update any links in your site to use HTTPS as well.

Other solutions include using the rewrite engine, but Apache discourages doing so for this use case.

Configuring Securely Encrypted Connections to Your Web Server

It’s now easy and free to get a CA-signed SSL certificate thanks to Let’s Encrypt and CertBot. Combined with the highly configurable Apache web server, it’s fairly straightforward to get your websites tuned up to use HTTPS with your desired site setup in mind.

Using secure connections is a great way to provide your users with increased security and privacy, preventing man-in-the-middle attackers from getting their sensitive information or interfering with the content you provide. Happy encrypting!

Configuring SSL with Apache and Let’s Encrypt (Part 1 – Let’s Encrypt and CertBot)

Overview

With the robustness of modern webpages, HTTPS is the way to go for almost any website. Most of the time an individual is online, they’re exchanging sensitive information like passwords and personal information. If you’re logging into a site with a password, a securely encrypted connection is a must.

However, even informational sites should use encryption! If you’re running a site for a small business, a topic that interests you, or even your own portfolio, your users can benefit from increased privacy and security when they connect using HTTPS rather than plain text HTTP.

Configuring SSL for a Website

Why SSL?

It’s important to roughly understand how HTTPS works, and why using it will improve the privacy and security of your users. From the 10,000 foot view, what SSL does is encrypt all the traffic between the client (the user’s browser) and the server (your web server). Everything beyond the domain name is encrypted, and therefore hidden from a man-in-the middle attacker.

From a security perspective, this prevents an adversary on the network from modifying content en-route to the user. Imagine you’re an application developer and you have software available for download. You even go so far as to supply a SHA-256 fingerprint of your application so that users can verify its contents. If you’re transmitting your application over HTTP, a malicious man-in-the-middle could modify the application and fingerprint in transit, giving your end user a nice dose of malware. But if you allows your users to download over a secure connection, the attacker could not see or change either component on its way to your (happy, malware free) consumer.

From a privacy perspective, HTTPS prevents a man-in-the-middle from snooping on what your user is viewing on your website. The only thing someone on the network could see is that they are connecting to your particular domain, no specific URLs or content. Imagine your user is doing research on a sensitive topic, or trying to download a piece of software that someone doesn’t want them to have. To some extent, HTTPS protects that person’s privacy by preventing a snooper from understanding what they see going over the wire.

Obtaining an SSL Certificate with Let’s Encrypt

If you’re sufficiently convinced that your website should use SSL, the good news is that you can set that up for free! The awesome folks over at Let’s Encrypt have developed a service that allows you to obtain your very own, CA signed SSL certificate from the command line.

To actually get the certificate for your system, you’ll want to use an ACME client called CertBot from the Electronic Frontier Foundation. CertBot uses the ACME protocol to automatically verify your ownership of the domain you want a certificate for and fetch that certificate from the Let’s Encrypt service.

The CertBot website has instructions for all kinds of web servers and server operating systems. You can visit that site if you need further help for a particular combination, but in this article I’ll show you how to get a certificate on Ubuntu Server for Apache (that’s what I use). To install the CertBot application, run:


sudo apt-get update
sudo apt-get install software-properties-common
sudo add-apt-repository ppa:certbot/certbot
sudo apt-get update
sudo apt-get install python-certbot-apache

Once certbot is installed, you have two options for fetching a certificate. If you only have one website with a basic configuration, you can have CertBot automatically configure Apache to use the SSL certificate it will generate. This is done using:

sudo certbot --apache

That’s all you need! You can now visit your website and you should see a little green lock in your browser indicating that you’re connected via HTTPS with a valid certificate.

Using HTTPS to Keep Your Users Secure

It’s important to encrypt traffic between your server and the clients browsing your site so that sensitive information cannot be snooped or manipulated by someone on the network. This article discusses why you should be using HTTPS for your websites, and how to get a free certificate with EFF’s CertBot and Let’s Encrypt. The Apache configuration is done automatically using the commands shown here.

However, if you’re like me, you may prefer a little more fine-tuned control over your system configuration. The next article will go more in depth on hand-tuning Apache for your needs – we’ll cover certificate-only CertBot usage, making SSL virtual hosts, and forcing secure connections.

Avoiding “Mixed Content” Issues In Your Client-Side Scripts

Overview

Web applications are particularly vulnerable to security issues, given that the internet is one giant, unsecured network that anyone can access. One of these major vulnerabilities is man-in-the middle attacks, where a malicious user accesses content sent between the client (the machine receiving web content) and the server (the machine providing web content).

Fortunately, the widespread adoption of SSL/TLS has improved the security of web pages by encrypting content between client and server and making it difficult for man-in-the-middle attacks to occur. It is important, however, that web pages and web applications are properly marked up and coded to avoid having content sent unsecured.

Avoiding mixed content for greater security

What is mixed content?

Mixed content refers to unsecured (HTTP) content loaded within a secure page (HTTPS). For example, this content could be an image, video, or audio file requested by the page with an unsecure URL. This could also be a script loaded by the page over HTTP. Scripts loaded unsecurely can potentially be more dangerous than media files like images.

Why is mixed content a problem?

Privacy concerns

Mixed content can create several problems for an end user of your website. The first issue is that of privacy. When a user loads a page over SSL/TLS (HTTPS), the only thing someone snooping on the network (a “man-in-the-middle”) can see is the domain of the page being requested. For example, a if user visiting https://somesite.com/sensitive-article, the only thing a snooper would see is that the user is requesting content from somesite.com. They can’t see the particular page the user is requesting. However, what if the page sensitive-article requests an image over HTTP? Then the snooper could see the unencrypted image, possibly giving them clues about what the end user is viewing!

When browsing with HTTPS, the user has some expectation of privacy between his or her self and the server. Mixed content breaks that by exposing some parts of the page in plaintext as they travel across the wire, potentially giving some clues as to what the user is viewing.

Code security concerns

A second concern with mixed content is that a script sent in plain text could potentially be modified en-route to the client. If a script is sent over HTTP, a malicious user could potentially intercept that script and inject some malicious code of their own. This code could be used to collect sensitive information or trick the user into visiting an illegitimate website (very common with phishing scams). The end user expects a legitimate script sent securely from the server (they requested a page over HTTPS, after all), but in a mixed content scenario this code could be intercepted and modified by an attacker.

How to avoid mixed content

Use relative links as much as possible

One of the easiest ways to avoid mixed content in web pages is to use relative links. With a relative link, the browser will use the protocol the page was requested with by default, no hard-coding of a protocol required. For example, loading your images from a subfolder called photos can be done with a relative path like so: src="photos/myimage.jpg". If the user requests your page over HTTPS, the image will be loaded over HTTPS as well.

Use // to auto-detect the protocol for outside links, or always use HTTPS

Sometimes you may want to load content from outside your own domain and your own web server. Many developers like to load scripts like jQuery from Content Delivery Networks (CDN’s) rather than store those scripts on their own server. In this case, you can use // in front of the URL instead of a hard coded protocol like http://. This will match the protocol the page was requested from. For example, src="//cdn.jquery.com/some-jquery-version.js". Better yet, if you know the site provides content over HTTPS, just use that by default!

Avoiding mixed content is fairly straightforward!

While mixed content can pose several issues for user privacy and application security, it is thankfully trivial to mitigate. Avoiding mixed content problems is as easy as fixing the links in any pages your site provides. If you’re not sure your pages avoid this problem, the developer console in a modern browser like Firefox will tell you if you’re trying to load insecure content. Many browsers (including Firefox) even block that content from loading. Be sure to be mindful of how media and scripts are requested in your web applications, and your users will be more secure for it!

Starting to Secure MySQL (Part 2 – Principle of Least Privilege)

Overview

In the last article, we discussed a solid starting point for securing MySQL installations using mysql_secure_installation. Running this script is great way to eliminate some default security flaws with MySQL, but it is just a starting point.

In MySQL, multiple database users can be created for different hosts and privileges can be set for different databases. In order to further secure each database created for a particular application, it is important to look at how users are created and what privileges are granted on that database.

Creating users for the database

Why isolated users improve security

Say you’re creating a web application and need to supply a database user and password in your backend code so you can connect to the database and select some records to display. It might be tempting to just plug in the root user. After all, root has all the privileges you need and it’s already built in, so it should just work.

But what happens if say, PHP isn’t properly configured and a user requests your webpage from the server? Now they have the root username and password since the source code was returned to them instead of executed! A malicious attacker could gain access to your server through another vulnerability and wreak havoc on your databases.

This is the first part of using the “Principle of Least Privilege” to secure your database. It’s a really good idea to have a user account (or multiple user accounts) with access to only one database. Don’t configure this user to have privileges on any other database. In the event that user’s credentials are compromised, the damage can at least be isolated to only one database.

What about hosts?

Another important consideration for creating users is where they have privileges to access the database from. With a database engine like MySQL, it’s possible to connect directly to the database from a remote location. You could configure a user for your web application’s database to connect from any host, so you could always work on the database from a coffee shop or a friend’s house.

However, this is another important place to limit privileges to the least amount needed to be operational. In the above scenario, if an attacker gets the credentials for your database user, they could log in from their bedroom in their PJ’s and start messing with your database!

Instead of allowing a user to connect from any host, consider the only place a user should be able to connect from to make the database functional. In a web application scenario localhost is often a good choice because it limits the user to connecting from the web server the application is running on. Now if an attacker gets access to your user’s credentials, they would have to also gain shell access to the server. This presents an additional obstacle between them and your database that can be difficult to overcome with a properly configured server.

How to create a user for a specific host

In MySQL, creating a user for a specific host only requires one command that is fairly straightforward. For example:

CREATE USER "MyUser"@"localhost" IDENTIFIED BY "MyLongAndStrongPassphrase!";

The above example creates a user called MyUser that can only log in from localhost. The IDENTIFIED BY part tells the database engine what password the user will log in with. Be sure to create a unique, strong password for this user.

Configuring user privileges

What privileges should a user have?

Much like the question of “what hosts should I allow my user to connect to?”, the question of what privileges a user should be granted comes down to the least amount of privileges needed for the user to do its job.

Just like it’s easy to use the root user for everything, it’s easy to grant all privileges to a user for your database so everything just works. But again consider the scenario where an attacker gains access to that user’s credentials. Even if that user can only log into one database, they could drop tables or modify records maliciously. Imagine an attacker being able to modify the price of an expensive item in your web store to any amount they want!

Now in that scenario, the only access your web store needs to the database is to SELECTthe prices of available items. Why give the user access to update or delete records at all? This is the core of the “Principle of Least Privilege”. Only give users the minimum amount of privileges they need to do the job they do. In our web store example, that means they only need the ability to do a SELECT on the database and nothing more.

How to grant user privileges in MySQL

In MySQL, specifying user privileges is done using the GRANT statement. For example:

GRANT SELECT ON MyWebStore.PRICES to "MyUser"@"localhost";

The first part of this command (the GRANT SELECT) tells the engine that this user will get the SELECT privilege only. There are a bunch more, but some of the common ones include INSERT, UPDATE, DELETE, SELECT, DROP, CREATE.

The second part of this command (MyWebStore.PRICES)specifies which database and which table(s) the privileges will apply to. If we’re following the “Principle of Least Privilege”, this user should only be granted these privileges on one database. As for tables, you may want to specify one table or all tables in the database depending on your application. You can apply the privileges to all tables using the * wildcard operator, like MyWebStore.*.

The final part of this command ("MyUser"@"localhost") specifies which user (on which host) the privileges will apply to.

There is one final step, and that’s to immediately apply all these privileges. In MySQL the command FLUSH PRIVILEGES immediately commits the changes to the database.

Being mindful

Applying the “Principle of Least Privilege” is another important methodology for securing MySQL database installations against attackers. It’s important to be mindful of the needs of your application when creating users and granting privileges, so that each user has the minimal amount of access needed to do their job. By isolating users to one database, one or few hosts, with a minimal amount of privileges, we can isolate and minimize the potential done by an attacker that gains access to database user credentials.

Starting to Secure MySQL (Part 1 – mysql_secure_installation)

Overview

Databases should always secured, regardless of the environment. However, database security is particularly important for web-facing applications. MySQL is a very popular database especially for web development, as a part of the LAMP (Linux-Apache-MySQL-PHP) stack.

Database security can be a complex topic, but there are a few basic security measures everyone can and should use when building applications that use MySQL.

The mysql_secure_installation script

Fortunately, the developers of MySQL wanted to make security accessible. They’ve added a great security script to the package called mysql_secure_installation. This script allows the user to quickly attack some basic security flaws in the default MySQL installation.

What it does

This script addresses several issues with the default installation:

  • Sets a password for the root user
  • Disallows root login from remote hosts
  • Removes anonymous users
  • Removes the test database

Why these measures improve security

    • Sets a password for the root user

For any system with an “administrative” user, it is paramount that not just anyone can access that account and the privileges associated with it. Giving the root user a secure password prevents an unauthorized person from modifying databases, users, or user privileges

    • Disallows root login from remote hosts

Similar to setting a secure password for root, preventing root login from remote hosts is an extra measure that prevents an unauthorized user from having access to do anything they want to your database installation.

  • Removes anonymous users

By default, MySQL allows anyone without a user account to connect to the database engine for testing purposes. This is undesirable because even though anonymous users may not have the same privileges as root, it could allow an unauthorized user to snoop around your MySQL installation and see how databases are defined, what users are available, etc. Removing anonymous users means that only an authorized user with a password can connect to MySQL.

  • Removes the test database

Another default feature of MySQL is a test database that anyone can connect to. Since anyone can connect to the database, this gives an attacker a starting point for access to your database if there are any vulnerabilities. We don’t want any databases that allow unauthorized users to connect.

How to use this feature

Since this application comes bundled with MySQL, it is fairly straightforward to run. At a terminal window, execute mysql_secure_installtion. You’ll see several prompts for each security enhancement:

Change the root password? [Y/n]
Remove anonymous users? [Y/n]
Disallow root login remotely? [Y/n]
Remove test database and access to it? [Y/n]
Reload privilege tables now? [Y/n]

The last item ensures that the changes made by the script take effect immediately. It is best practice to answer Y (yes) to all of the prompts!

Next Steps

All of the features provided by mysql_secure_installation are a great starting point for securing your MySQL installations. In another blog, I’ll address how to set user privileges for your database following the “Principle of Least Privilege”.