Chapter 5 Web Servers

Last updated: 2018-11-11 20:44:59

5.1 Introduction

As mentioned in Chapter 1, a web page is a document suitable for display and distribution over the internet. At the most basic level, a web page is an HTML document which is located at a node on the internet. This node is called a server, as it serves the file to the world wide web, allowing your computer, the client, to access it.

When you open a web browser, such as Chrome, and input a URL, such as http://www.ynet.co.il, into the address bar, the web browser navigates to the node you have specified and requests this document. The browser then reads and interprets the document, and displays its contents on your screen. The browser also applies CSS styling rules (Chapter 2) and runs JavaScript (Chapter 4), in case those are linked to the document.

This means that to host a website you need to take care of two things -

You need to have the right kinds of documents and code files
You need to have a location on the internet (hardware) where you can place these documents and code files, as well as an appropriate environment (software) to serve them

We already discussed what kind of documents and code files we can use to build a website. Specifically, we learned about -

HTML, in Chapter 1
CSS, in Chapter 2
JavaScript, in Chapters 3 and 4

We have also seen how our customized HTML documents, and any associated files such as CSS and JavaScript, can be created using a plain text editor, such as Notepad++, then opened and viewed in the browser. The natural question that arises is what we need to do to take the next step and turn our page into a “real” website, one that can be accessed and viewed by other people rather than just us.

In this Chapter we focus on exactly that, the second part of the picture: having a location where our documents are placed, and software to publish or serve them over the network, so that they are accessible to other people.

5.2 Web servers

The term web server can refer to hardware or software, or both of them working together, for serving content over the Internet.

On the hardware side, a web server is a computer that stores a website’s component files, such as -

HTML documents
CSS stylesheets
JavaScript files
Other types of files, such as images

The server delivers these files to the client’s device. It is connected to the Internet and can be accessed through a URL such as http://www.ynet.co.il.

On the software side, a web server includes several parts that control how web users access hosted files. The minimal required software component is an HTTP server. An HTTP server is a software component that understands URLs (web addresses) and the Hypertext Transfer Protocol (HTTP), which is the protocol used by the browser to communicate with the server. A server with just the HTTP server component is referred to as a static server, as opposed to a dynamic server which has several additional components.

Don’t worry if this is not clear yet - we elaborate on HTTP servers, HTTP, URLs, and static and dynamic servers in the following Sections 5.3, 5.4 and 5.5.

At the most basic level, whenever a browser needs a file hosted on a web server, the browser requests the file via the HTTP protocol (Section 5.3 below). When the request reaches the correct web server (hardware), the HTTP server (software) sends the requested document back, also through HTTP (Figure 5.1).

FIGURE 5.1: Client-server communication through HTTP
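The request-response cycle described above can be tried out on a single computer. The following is a minimal sketch, assuming Python 3.7 or later; the temporary directory, the contents of index.html, and the use of a randomly assigned free port are illustrative choices, not part of the book's setup. The script plays both roles: it starts an HTTP server (software) for a directory, then acts as the client and requests a document from it.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading

# A throwaway directory with one HTML document (illustrative content)
root = pathlib.Path(tempfile.mkdtemp())
(root / "index.html").write_text("<h1>Hello from the server</h1>", encoding="utf-8")

# The HTTP server (software): serves the files found in `root`.
# Port 0 asks the operating system for any free port.
handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=str(root))
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client: requests the document through HTTP and reads the response
connection = http.client.HTTPConnection("127.0.0.1", port)
connection.request("GET", "/index.html")
response = connection.getresponse()
status = response.status
body = response.read().decode("utf-8")
connection.close()

# Stop the server once the exchange is complete
server.shutdown()
server.server_close()

print(status)
print(body)
```

In a real setting the client is the browser and the server usually runs on a different computer, but the exchange follows the same pattern.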

5.3 Communicating through HTTP

5.3.1 Web protocols and HTTP

HTTP is a protocol specifying the way that communication between the client and the server takes place. As its name implies, HTTP specifies how to transfer hypertext (i.e., HTML documents) between two computers.

HTTP is not the only protocol in use for communication on the web. For example, FTP and WebSocket are other web communication protocols. However, HTTP is the most basic and most commonly used protocol. Almost everything we do online, including everything we do in this book, is accomplished through HTTP communication. The secured version of HTTP, called HTTPS, is becoming a very common alternative to HTTP and thus should be mentioned. However, HTTPS just adds a layer of security through encrypted communication, and is not fundamentally different from HTTP.

In the context of the web, a protocol is a set of rules for communication between two computers. HTTP, specifically, is a textual and stateless protocol.

Textual means that all commands are plain-text, which means they are human-readable
Stateless means that neither the server nor the client remembers previous communications. For example, an HTTP server, relying on HTTP alone, cannot remember if you are “logged-in” with a password, or which step of a purchase transaction you are in
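To see what “textual” means in practice, we can write an HTTP request by hand and send it over a plain network connection. The sketch below is only an illustration, assuming Python 3.7 or later and a small local server started inside the script itself; the file name and its contents are made up. Note that the request is a few lines of readable text, and that it carries everything the server needs (the method, the path and the host), which is what makes a stateless exchange possible.

```python
import functools
import http.server
import pathlib
import socket
import tempfile
import threading

# Local static server for a temporary directory (illustrative setup)
root = pathlib.Path(tempfile.mkdtemp())
(root / "index.html").write_text("<p>hi</p>", encoding="utf-8")
handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=str(root))
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# An HTTP request is plain text: a request line, headers, and an empty line
request_text = (
    "GET /index.html HTTP/1.1\r\n"
    f"Host: 127.0.0.1:{port}\r\n"
    "Connection: close\r\n"
    "\r\n"
)

# Send the text as-is and collect everything the server sends back
with socket.create_connection(("127.0.0.1", port)) as sock:
    sock.sendall(request_text.encode("ascii"))
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # the server closed the connection
            break
        chunks.append(data)

server.shutdown()
server.server_close()

# The response is text too: a status line, headers, an empty line, the document
response_text = b"".join(chunks).decode("utf-8")
status_line = response_text.split("\r\n", 1)[0]
print(request_text)
print(response_text)
```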

Furthermore, only clients can make HTTP requests, and then only to servers. Servers can only respond to a client’s HTTP request. When requesting a file via HTTP, clients must provide the file’s URL. The web server must answer every HTTP request, at least with an error message. For example, in case the requested file is not found the server may return the “404 Not Found” error message. The 404 error is so common that many servers are configured to send a customized 404 error page.

Try navigating to http://google.com/abcde.html, or any other non-existing document on http://google.com/. What you should see is Google’s customized 404 error page.
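The same behavior can be reproduced locally. In the following sketch (assuming Python 3.7 or later), an empty temporary directory is served, so any requested file is missing; the file name abcde.html is simply reused from the exercise above. The server still answers the request, with status code 404 and a short error page.

```python
import functools
import http.client
import http.server
import tempfile
import threading

# Serve an empty temporary directory, so that every file request fails
root = tempfile.mkdtemp()
handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=root)
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Request a document that does not exist
connection = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
connection.request("GET", "/abcde.html")
response = connection.getresponse()
status = response.status
error_page = response.read().decode("utf-8")
connection.close()

server.shutdown()
server.server_close()

print(status)       # the numeric status code of the response
print(error_page)   # the error page generated by the server
```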

5.3.2 HTTP methods

The HTTP protocol defines several methods, or verbs, to indicate the desired action on the requested resource on the server. The two most commonly used HTTP methods for a request-response between a client and server are GET and POST. There are a few other methods which are used much more rarely.

GET - Used to request data from the server
POST - Used to submit data to be processed on the server

5.3.2.1 `GET`

The GET method is used to request data. It is by far the most commonly used method during our usual interaction with the web. For example, typing a URL in the address bar of the browser in fact instructs the browser to send a GET request to the respective server. A static server (Section 5.4.2 below) is sufficient for processing GET requests in case the requested file is physically present on the server. The response is usually an HTML document, which is then displayed in the browser, but it can also be other types of content, such as GeoJSON.

In addition to manual typing in the browser address bar, GET requests can also be sent programmatically, by running code. In this book we will frequently send GET requests using JavaScript code. For example, in the following Chapters we will learn about a method for loading GeoJSON content from files (Section 7.7.2) or from a database (Section 9.7) to be displayed on a web map, which uses GET requests.
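In the book these programmatic GET requests are sent with JavaScript. Purely to illustrate the idea, here is a sketch of the same kind of request sent from Python (3.7 or later) to a local static server started by the script. The file name points.geojson and its single-point content are invented for the example and are not part of the book data.

```python
import functools
import http.client
import http.server
import json
import pathlib
import tempfile
import threading

# A tiny GeoJSON file to serve (made-up content)
geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "Sample point"},
            "geometry": {"type": "Point", "coordinates": [34.8, 31.25]},
        }
    ],
}
root = pathlib.Path(tempfile.mkdtemp())
(root / "points.geojson").write_text(json.dumps(geojson), encoding="utf-8")

# Local static server for the directory
handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=str(root))
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Programmatic GET request: no browser address bar involved
connection = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
connection.request("GET", "/points.geojson")
response = connection.getresponse()
data = json.loads(response.read())  # parse the GeoJSON text into Python objects
connection.close()

server.shutdown()
server.server_close()

print(data["type"], len(data["features"]))
```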

5.3.2.2 `POST`

The POST method is used when the client sends data to be processed on the server. It is more rarely used compared to GET, and somewhat more complicated. For example, there is no way to send a POST request by simply typing a URL in the browser address bar, unlike with GET. Instead, a POST request is typically sent through code, such as JavaScript, or by submitting an HTML form. Also, a dynamic server (Section 5.4.3 below) is required to process POST requests, where server-side scripts determine what to do with the received data. Plainly speaking, POST requests are preferred over GET requests when substantial amounts of data need to be sent to the server.

In this book we will encounter just one example of using a POST request, in Chapter 13. In that Chapter, we will build a crowdsourcing web application where the user draws layers on a web map, which are subsequently sent for permanent storage in a database. Sending the drawn layer to the server (Section 13.5) is accomplished using POST requests.
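To give a feeling for what processing a POST request involves, here is a minimal sketch in Python (3.7 or later). It is not the setup used in Chapter 13: the handler class DrawingHandler, the /save path, the in-memory list standing in for a database, and the JSON reply are all assumptions made for the illustration. The point is that the server runs custom code (a server-side script) to decide what to do with the received data.

```python
import http.client
import http.server
import json
import threading

received = []  # stands in for permanent storage, such as a database


class DrawingHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical server-side script: stores whatever JSON is posted."""

    def do_POST(self):
        # Read exactly as many bytes as the client declared
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        received.append(payload)
        # Reply with a short JSON message
        reply = json.dumps({"stored": len(received)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), DrawingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client sends data in the body of a POST request
body = json.dumps({"type": "Point", "coordinates": [34.8, 31.25]})
connection = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
connection.request("POST", "/save", body=body,
                   headers={"Content-Type": "application/json"})
response = connection.getresponse()
status = response.status
answer = json.loads(response.read())
connection.close()

server.shutdown()
server.server_close()

print(status, answer, received)
```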

5.4 Static vs. dynamic servers

5.4.1 Overview

Web servers can be divided into two general categories -

Static web servers
Dynamic web servers

What we discussed so far, and what we use in this book, refers to the simpler static servers. Dynamic servers have some additional complexity, and we will only mention them, for general information, in this Section.

5.4.2 Static servers

As noted above, a static server consists of a computer (hardware) with just an HTTP server (software). We call it “static” because the server sends its hosted files “as-is” to your browser, without any additional pre-processing. This means the server can only respond to GET requests for pre-existing HTML documents, and send those documents to the browser. While loading the HTML document, the browser may send further GET requests for other pre-existing files linked in the HTML code, such as CSS, JavaScript, images, and so on.

For example, suppose we are visiting a hypothetical website focused on travel locations, and we are navigating to a specific page on travelling to France, at http://www.travel.com/locations/france.html. In case the website is served using a static server, there is an actual france.html document on the server. All the server has to do is send you a copy of that file (Figure 5.2).

For example, the web page for this book (which you are currently reading) is hosted on a static server. This means that all of the HTML documents comprising the website are prepared in advance. Entering a URL for a specific page (such as web-servers-1.html) sends the appropriate file to your browser through HTTP.

FIGURE 5.2: Static server architecture

5.4.3 Dynamic servers

A dynamic server consists of an HTTP server plus extra software, most commonly an application server and a database. We call it “dynamic” because the server dynamically builds the HTML documents, or any other type of content, before sending them to your browser via the HTTP server. Typically, the dynamic server uses its application server (i.e. software running server-side scripts), HTML templates, and a database to assemble the HTML code. Once assembled, the HTML content is sent via HTTP, just like static content.

With a dynamic server, when entering the above-mentioned hypothetical URL, http://www.travel.com/locations/france.html, into the address bar in our browser, the france.html document doesn’t exist yet. The server waits for your request, and when the request comes in, it uses various “ingredients” (e.g. templates, a database, etc.) and a “recipe” to create that page on the spot, just for you (Figure 5.3). For example, websites like Wikipedia have many thousands of webpages, but they aren’t real HTML documents, only a few HTML templates and a giant database. This setup makes it easier and quicker to maintain and deliver the content.

FIGURE 5.3: Dynamic server architecture
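The “ingredients” and “recipe” idea can be sketched in a few lines of Python. Everything in this example is hypothetical: the template text, the LOCATIONS dictionary standing in for a database, and the render_location function. A real dynamic server would run a function of this kind for each incoming request and send the result through its HTTP server.

```python
import html
import pathlib
import string

# "Ingredient" 1: an HTML template with placeholders (hypothetical)
PAGE_TEMPLATE = string.Template(
    "<!DOCTYPE html><html><head><title>$name</title></head>"
    "<body><h1>Travelling to $name</h1><p>Capital: $capital</p></body></html>"
)

# "Ingredient" 2: a dictionary standing in for a database (hypothetical)
LOCATIONS = {
    "france": {"name": "France", "capital": "Paris"},
    "italy": {"name": "Italy", "capital": "Rome"},
}


def render_location(path):
    """The "recipe": build the HTML for a requested path, or None if unknown."""
    key = pathlib.PurePosixPath(path).stem  # "/locations/france.html" -> "france"
    record = LOCATIONS.get(key)
    if record is None:
        return None  # a real server would respond with a 404 error here
    # Escape the values so they are safe to place inside HTML
    safe = {field: html.escape(value) for field, value in record.items()}
    return PAGE_TEMPLATE.substitute(safe)


page = render_location("/locations/france.html")
missing = render_location("/locations/atlantis.html")
print(page)
print(missing)
```

Note that there is no france.html file anywhere: the document exists only from the moment it is assembled.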

More information on the differences between static and dynamic servers can be found in the Introduction to the server side article by Mozilla.

5.4.4 Software

As we will see shortly (Section 5.6), running a static server is easy to do on our own - assuming we already have our web page documents prepared. An HTTP server is included in many software packages and libraries, and it does not require any special installation or configuration. There are numerous software solutions which can start an HTTP server in a few minutes; Python (Section 5.6.1) and R are just two examples. There are also several free cloud-based options to have a managed static server such as GitHub Pages (Section 5.6.2), which means you do not even need to have your own dedicated computer or invest in paid cloud-based services to run your static server.

There are also professional HTTP server software packages, used for building both static and dynamic servers, which we will not use in this book. Currently, the two most commonly used ones are -

Setting up and running a dynamic server is more complicated, requiring specialized installation, configuration and maintenance. There are no instant solutions, such as the ones we will see shortly for static servers, since it is up to us to define the way in which the server dynamically generates HTML content. The latter is done by setting up an application server and writing custom server-side scripts.

With a dynamic server, in addition to the HTTP server software, you need to write server-side scripts which run on the server, as opposed to client-side JavaScript code which runs on the client (Chapter 4). Server-side scripts are responsible for tasks such as generating customized HTML content, authentication, managing user sessions, etc. There are several programming languages (and frameworks) which are commonly used for writing server-side scripts -

R (Shiny) is also a dynamic server framework, though it is much more specialized (on communicating with an R session) compared to the above frameworks, which are general-purpose.

5.4.5 Practical considerations

There are advantages and disadvantages to both the static and the dynamic server approaches. Static sites (i.e. sites served with a static server) are simple, fast and cheap, but they are harder to maintain (if they are complex) and impersonal. Dynamic sites provide more flexibility and are easier to modify, but also slower, more expensive and technically more difficult to build and handle.

In this book we will only build static sites hosted using a static web server. A static server cannot use a database or template to send personalized HTML content, just pre-compiled HTML documents. Nevertheless, as we will see throughout the book, a static server is not limited to showing fixed, non-interactive content. For example, the HTML content of the web page can be modified in response to user actions through client-side scripts (in JavaScript), without needing a server, using the methods we learned in Chapter 4. Later on we will also see that static pages can dynamically “grab” information from other locations on the web, again using client-side JavaScript, including from existing servers and databases (through their APIs). That way, we can integrate dynamic content even though we do not operate our own dynamic server. However, there are things that can only be accomplished with a dynamic server.

The most notable example where a dynamic server is an obvious solution is authentication. For example, suppose we want to create a password-protected website. In our website, we can add a form where the user enters a password, and the content is shown only if the password is valid. This requires authentication - some way to evaluate the validity of the entered password.

Now, suppose we have a database of valid passwords for the various authorized website users. Where can we place that database and how can our page access it? If we place it directly on the client, e.g. as an array in our JavaScript script (which we know how to do from Chapter 4), the content will be exposed to anyone looking at the source code of our page. Remember that whenever a JavaScript file is linked to our HTML document, the user can access the code. Even if we place the password database in a different location, such as another static server, we still need to hard-code the instructions for accessing that other location in our JavaScript code, so that the web page can access it. Again, anyone who reads those instructions can access the database the same way the browser does.

The solution is to send the user-entered password for validation using a server-side script. If the password is valid, the server can return an “OK” message and/or any content that the specific user is allowed to see. That way, the password database is not accessible, since the server is not allowed to send it - only to accept an entered password and compare it to those in the database.
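As a rough illustration of such a server-side check, consider the following Python sketch. The user name, the password, and the function names are invented. The example also goes one step beyond the description above: instead of keeping the passwords themselves, it keeps only salted hashes, which is common practice but is our assumption here rather than something this chapter prescribes. The important part is that the code and the USERS table stay on the server, and the client only ever receives a yes/no answer.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Derive a hash from a password and a salt using PBKDF2 (standard library)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# Server-side "database" of users (hypothetical): salt and hash, not the password
_salt = os.urandom(16)
USERS = {"dana": (_salt, hash_password("correct horse", _salt))}


def check_login(user, password):
    """Return True only if the user exists and the password matches."""
    record = USERS.get(user)
    if record is None:
        return False
    salt, stored_hash = record
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(stored_hash, hash_password(password, salt))


print(check_login("dana", "correct horse"))
print(check_login("dana", "wrong guess"))
print(check_login("nobody", "correct horse"))
```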

From now on, we concentrate on static servers and how to set them up.

5.5 URLs and file structure

5.5.1 URLs and `index.html`

As we will see in a moment, a static server is associated with a directory on the computer, serving its contents over the web. Once the server is running, you can request any HTML document (or other type of file) which is located inside that directory, or any of its sub-directories, by entering a URL in the browser address bar.

From the point of view of a user who wants to reach a certain web page, to construct the right URL one needs to know two things -

The IP address of the host computer and the port where the server is running, or alternatively the respective Domain name (see below)
The path to the HTML document on the server

For example, the online version of the Chapter you are reading right now can be reached by entering the following URL into the browser -

http://159.89.13.241:8000/web-mapping/web-servers-1.html

Let’s break this URL into pieces -

http:// means we are communicating using HTTP, however this part is automatically completed by the browser so you don’t have to type it
159.89.13.241 is the IP address of the host computer. It is a unique address given to our computer, making it possible to identify it in the network
:8000 is the port number where the server is running (see below). When using the default port, such as 80 for HTTP or 443 for HTTPS, the port number can be omitted
/web-mapping/web-servers-1.html is the location of the document. With a static server, this means that within the directory we are serving there is a sub-directory named web-mapping, and inside it there is an HTML file named web-servers-1.html
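The same breakdown can be obtained programmatically. As a small sketch, Python's standard urllib.parse module splits a URL into the components listed above:

```python
from urllib.parse import urlsplit

# The example URL from the text
url = "http://159.89.13.241:8000/web-mapping/web-servers-1.html"
parts = urlsplit(url)

print(parts.scheme)    # the protocol
print(parts.hostname)  # the IP address (or domain name) of the host
print(parts.port)      # the port number, or None if it was omitted
print(parts.path)      # the location of the document on the server
```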

More information on URL structure can be found in the What is a URL? article by Mozilla.

What happens if we remove the last part of the URL, the HTML file name web-servers-1.html, thus navigating to the /web-mapping/ directory?

http://159.89.13.241:8000/web-mapping/

Try it, and you will see that a web page is displayed (the first book chapter), even though we did not specify any HTML file name. This happens because, by convention, a web server sends a file named index.html by default when we navigate to a directory on the server. The index.html file usually contains the first page users see when navigating to a website.

You may now be wondering why we usually navigate to a textual URL such as https://www.google.com, rather than to a numeric IP address and port number, such as http://216.58.205.238/. The answer is something called the Domain Name System (DNS). When you enter a URL into your browser, the DNS resolves the domain name into the IP address of the appropriate web server. This saves us the trouble of remembering IP addresses, using more recognizable textual addresses instead.
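Name resolution can also be requested from code. The sketch below uses Python's standard socket module, which asks the operating system's resolver for an address; the resolve function is just a wrapper written for this example. To keep the example runnable without a network connection it resolves localhost (a name we will meet again in Section 5.6.1.3), which is normally answered by the computer itself rather than by a DNS query. Passing a public domain name instead would trigger an actual DNS lookup.

```python
import ipaddress
import socket


def resolve(hostname):
    """Return one IPv4 address for the given host name, as a string."""
    return socket.gethostbyname(hostname)


address = resolve("localhost")
parsed = ipaddress.ip_address(address)  # check that we got a valid IP address

print(address)
print(parsed.version)
# With network access, a public name can be resolved the same way,
# e.g. resolve("www.google.com"); the result varies by time and place.
```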

5.5.2 File structure

A typical file structure for a static website may look as follows -

|-- www
    |-- css
    |   |-- style.css
    |-- images
    |   |-- cat.jpg
    |-- js
    |   |-- main.js
    |-- dog.jpg
    |-- index.html

The various files that comprise the website are located either in the root directory (e.g. alongside the index.html), or in sub-directories within the root directory. In the above example, www represents the root directory.

In this example, the root directory www contains a default index.html document. As mentioned above, this means that when we browse to the server address without specifying any directory and/or file name the index.html file is sent and its contents displayed to the user.

The root directory www also contains sub-directories. Here these sub-directories are used for storing additional files linked in the hypothetical HTML code of index.html -

An images directory for images
A css directory for CSS (.css)
A js directory for JavaScript (.js)

It is usually convenient to have this type of sub-directory structure, where files of different types are stored in separate directories. That way, the various components that make up our website can be easier to maintain. Of course you can organize the files comprising your website any other way you prefer.
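As an illustration of how such a file structure maps to URLs, the following sketch (Python 3.7 or later) builds part of the above structure in a temporary directory, serves it locally, and requests a few paths. The file contents are placeholders, the image files are left out, and the fetch helper is defined just for this example.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading

# Build the root directory "www" with its sub-directories
root = pathlib.Path(tempfile.mkdtemp()) / "www"
for sub in ("css", "images", "js"):
    (root / sub).mkdir(parents=True)
(root / "index.html").write_text("<h1>Home</h1>", encoding="utf-8")
(root / "css" / "style.css").write_text("h1 { color: navy; }", encoding="utf-8")
(root / "js" / "main.js").write_text("console.log('loaded');", encoding="utf-8")

# Serve the root directory
handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=str(root))
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()


def fetch(path):
    """Send a GET request and return (status, content type, body text)."""
    connection = http.client.HTTPConnection("127.0.0.1", port)
    connection.request("GET", path)
    response = connection.getresponse()
    result = (response.status,
              response.getheader("Content-Type"),
              response.read().decode("utf-8"))
    connection.close()
    return result


home = fetch("/")                # no file name: index.html is sent by default
explicit = fetch("/index.html")  # the same document, requested by name
css = fetch("/css/style.css")    # a file inside a sub-directory

server.shutdown()
server.server_close()

print(home)
print(css)
```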

5.5.3 Relative paths

In the above file structure example, the images folder contains an image file named cat.jpg. In case this image is displayed on the index.html page, the HTML code in that file may include an <img> element (Section 1.5.9) where the src attribute refers to the cat.jpg file. Either of the following versions works -

<img src="/images/cat.jpg">
<img src="images/cat.jpg">

Both versions of the src attribute value use relative file paths, because they are relative to a given location on the server.

In the first case, the path is relative to the root directory, specified by the initial / symbol
In the second case, the path is relative to the current directory where the HTML is loaded from, so the path just starts with a directory or file name in the same location (without the / symbol)

In this particular example the index.html is in the root directory, thus the current directory is identical to the root directory. Therefore the cat.jpg file can be reached with either /images/cat.jpg or images/cat.jpg.

Incidentally, we have another image file dog.jpg in the current directory (which is also the root directory). We could refer to the dog.jpg file using just the file name -

<img src="dog.jpg">

The latter is another example of a path relative to the current directory.
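The difference between the two kinds of paths becomes visible once the HTML document is not in the root directory. The sketch below uses urljoin from Python's standard library, which resolves a path against the URL of the page it appears in, similarly to what a browser does. The localhost address and the pages/about.html document are hypothetical and are not part of the file structure shown above.

```python
from urllib.parse import urljoin

# A page in the root directory, and a hypothetical page in a sub-directory
home_page = "http://localhost:8000/index.html"
nested_page = "http://localhost:8000/pages/about.html"

# From the root directory, both forms lead to the same file
print(urljoin(home_page, "/images/cat.jpg"))
print(urljoin(home_page, "images/cat.jpg"))

# From a sub-directory, only the path starting with / still points at the
# images directory in the root; the other one is resolved inside /pages/
print(urljoin(nested_page, "/images/cat.jpg"))
print(urljoin(nested_page, "images/cat.jpg"))
```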

5.5.4 CSS and JavaScript

5.5.4.1 Overview

As mentioned in previous Chapters, CSS and JavaScript code can also be loaded from separate files, usually ending in .css and .js, respectively. This takes a little more effort than including CSS and JavaScript inside the HTML document, but saves work as our sites become more complex, because -

Instead of repeating the same CSS and JavaScript code in different pages of the website, we can load the same file in all pages
When modifying our external CSS or JavaScript code, all web pages loading those files are immediately affected

We already mentioned how CSS (Chapter 2) and JavaScript (Chapter 4) code can be loaded from external files, but will repeat for the sake of completeness.

5.5.4.2 Linking CSS

Linking an external CSS file can be done using the <link> element within the <head> of the HTML page (Section 2.6.3). In the illustrative static file structure (see above), the css folder contains a style.css file which we could link in the index.html as follows -

<link rel="stylesheet" href="/css/style.css" type="text/css">

Note that in this case it makes sense to use a relative path which is relative to the root directory, since a website usually has a single set of CSS and JavaScript files. That way, exactly the same <link> element can be embedded in all HTML files regardless of where they are placed.

5.5.4.3 Linking JavaScript

Linking a JavaScript file can be done by adding a file path in the src attribute of a <script> element (Section 4.4.2). This is usually done in the <head> element of the HTML page. For example, our sample file structure contains a folder named js with a script named main.js. The script could be loaded in the index.html document by placing the following <script> element -

<script src="/js/main.js"></script>

Again, a relative path, relative to the root directory, is being used.

5.6 Running a static server

So far we have discussed several background topics related to running a static server -

Communication through HTTP
Difference between static and dynamic servers
Components of a URL address
File structure on the server

What is left to be done is actually running a server, to see how it all works in practice. In the next two sections we will experiment with running a static web server using two different methods -

A local server, using our own computer and Python (Section 5.6.1)
A remote server, using the GitHub Pages platform (Section 5.6.2)

5.6.1 Local with Python

5.6.1.1 Setup instructions

Let’s begin with the local server option. The exercise will demonstrate the HTTP server built into Python, since it only requires that you have Python installed.

If you are working in a computer classroom there is a good chance that Python is already installed. In any case, you can check if Python is installed by opening the Command Prompt (open the Start menu, then type cmd) and typing python.

If you see a message with the Python version number, then Python is installed and you have just entered its command line, marked by the >>> symbol. You can exit the Python command line by typing exit() and pressing Enter.

If you see an error message, such as the following one, then Python is not installed.

'python' is not recognized as an internal or external command, 
operable program or batch file.

Python installation instructions are beyond the scope of the book, but there is plenty of information on installing Python that you can find online.

5.6.1.2 Running the server

To run Python’s HTTP Server, follow these steps -

Open the Start menu and type cmd to enter the Command Prompt.
Navigate to the directory that you want to serve (e.g. where your index.html file is), using cd (change directory) followed by the directory name. For example, if your directory is in drive D:, inside a folder named Data and then a sub-folder named server, you need to type cd D:\Data\server. If the Command Prompt is currently on a different drive, type cd /d D:\Data\server instead, so that the drive is switched as well.
Type the expression python -m SimpleHTTPServer (if you are using Python 2) or python -m http.server (if you are using Python 3). To check which version of Python you have, type python -V in the Command Prompt.

If all goes well, you should see a message such as the following, meaning that the server is running.

Serving HTTP on 0.0.0.0 port 8000 ...

As evident from the above message, the default port (Section 5.5.1) for the server is 8000. In case you want to use a different port, such as 8080, you can specify the port number as an additional parameter -

python -m SimpleHTTPServer 8080  # Python 2
python -m http.server 8080       # Python 3

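To stop the server, press Ctrl+C in the Command Prompt window where it is running. If you are still inside the Python command line (marked by the >>> symbol), type exit() and press Enter to leave it.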

5.6.1.3 Testing served page

Once the server is running, you can access the served web page(s) by navigating to the following address in a web browser, assuming there is an index.html file in the root of the served directory.

http://localhost:8000

In case you want to load an HTML document other than index.html, or if the document is located in one of the sub-directories, you can explicitly specify the path to the HTML file you wish to load, for example -

http://localhost:8000/dir1/document2.html

If you initiated the server on a different port, replace the 8000 with the port number you chose.

The word localhost means you are accessing this computer. In other words, the server and the client are the same computer. This kind of setting is not very useful for sharing our site with a wider audience, but it is essential for development and testing.

The server usually reports on incoming requests. For example, the screenshot in Figure 5.4 shows several logged messages printed while using Python’s HTTP Server.

FIGURE 5.4: Running Python’s simple HTTP server

We can see the server successfully processed four GET requests, all of which took place in 2018, on February 22 around 15:08.

5.6.1.4 Interactive map example

The following three exercises will demonstrate the concept of running a static server, using Python’s HTTP server.

In the first exercise, we will run a simple HTTP server to serve a web page with an interactive map of towns in Israel. Don’t worry if you don’t understand most of the code and the purpose of each file. We will cover all of that in the following Chapters.

Let’s try using a static HTTP server to serve a web page over the network.

Download the files example-07-03.html and towns.geojson from the book data (see Appendix A) and place them in an empty directory. The first file is an HTML document, while the second one contains GeoJSON (Section 3.10.2).

Rename the example-07-03.html file to index.html

Open the renamed index.html file in a text editor and locate the line where it says $.getJSON("data/towns.geojson", function(data) {. This implies that the GeoJSON content will be loaded from the data sub-directory. Since in our case both files are placed in the root folder, change "data/towns.geojson" to "towns.geojson" and save the modified index.html file

Start a local server in the directory where both files are placed, and open the page in the browser by navigating to http://localhost:8000

You should see a map of town borders, with highlighted names on mouse hover (Figure 7.9)

Now try opening the index.html file by double-clicking on it

The towns layer should now be absent, because loading GeoJSON content from a file with the particular method being used is usually blocked by the browser unless a server is running, as we will see in Section 7.7.2. This demonstrates the necessity of going through the trouble of running a local server during website development, to correctly emulate the way that web content is served

5.6.1.5 Access from a different computer

In the second exercise, we will try to navigate to the served page from a different computer. Note that this exercise will not work under certain network settings, due to different complications that require some more effort to overcome.

For example -

If you are connected to the internet through a private network, for example behind a router at your home, then the IP address of your computer (shown with ipconfig, see below) refers to an internal address of the private network, so other computers will not be able to reach your page by typing it in the browser
If there is a firewall preventing other computers from reaching yours, then they will not be able to navigate to the page you are serving
In realistic settings, you need to have a static (constant) IP address, while most computers are assigned a different IP address each time they connect to the network

The above considerations are handled by network administrators and are beyond the scope of this book.

While the page from the previous exercise is up and running, let’s try to access it from a different computer over the network. This exercise should be done in pairs.

Start up the static server with the interactive map from the previous example

Identify the IP address of your computer. To do that, click on the Start button, type cmd in the text box to open a second Command Prompt window (the first one should still be running your server, so you cannot type additional commands there), then type ipconfig in the command line. Next, locate your IP address in the printed output (Figure 5.5). The address is listed on the line where it says IPv4 Address. For example, in the output shown in Figure 5.5 the IP address is 132.72.129.98

Tell the person next to you what your IP address is, and which port your server is running on. For example, if your IP address is 132.72.129.98 and the port number is 8000, then the address you should pass to the person next to you is http://132.72.129.98:8000

The other person will type the address in his/her browser. If all worked well, your website will be displayed. Check your server’s log - you should see the GET request(s) and the IP of the other computer which connected to your website!

FIGURE 5.5: Determining the IP address using ipconfig

5.6.1.6 Configuring linked CSS and JavaScript

Finally, in the third exercise we will experiment with linking CSS and JavaScript files in a static website.

Select one of the examples from Chapter 4

Download the HTML document for the example you chose, rename it to index.html, and place it in an empty directory

Open the index.html file in a text editor

Create a text file with the .css extension (i.e. an external CSS file), type one or more relevant styling rules into it, and place the file in a directory named css inside the directory where index.html is located

Link the CSS file to the index.html (Section 5.5.4.2)

Cut the JavaScript code out of the <script> element in index.html, and paste it in a text file with a .js extension. For example, you can name the file main.js.

Place the .js file you created in a directory named js inside the directory where your index.html file is located

Link the JavaScript file to the index.html (Section 5.5.4.3)

Serve the directory with the index.html file

Access the localhost address to make sure the web page works as expected. You can check the server log to see how the CSS and JavaScript files were actually sent by the server
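
A common mistake in this exercise is a link that points to a file which is not where the server expects it. As a rough sketch, assuming Python is available, the following lists the files referenced by the <link> and <script> elements of index.html and reports the ones missing from the served directory. The names linked_files and missing_files, as well as the file name style.css, are hypothetical. Only main.js comes from the exercise itself.

```python
from html.parser import HTMLParser
from pathlib import Path


class LinkedFileParser(HTMLParser):
    """Collect the paths in <link href="..."> and <script src="..."> elements."""

    def __init__(self):
        super().__init__()
        self.paths = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("href"):
            self.paths.append(attrs["href"])
        elif tag == "script" and attrs.get("src"):
            self.paths.append(attrs["src"])


def linked_files(html_text):
    """Return the linked CSS and JavaScript paths, in document order."""
    parser = LinkedFileParser()
    parser.feed(html_text)
    return parser.paths


def missing_files(site_dir):
    """Return the locally linked files that do not exist inside site_dir."""
    site_dir = Path(site_dir)
    html_text = (site_dir / "index.html").read_text(encoding="utf-8")
    # Skip files hosted elsewhere (full URLs and protocol-relative URLs)
    local = [p for p in linked_files(html_text)
             if "://" not in p and not p.startswith("//")]
    return [p for p in local if not (site_dir / p).is_file()]


# Example with the directory layout used in this exercise
example = """
<link rel="stylesheet" href="css/style.css">
<script src="js/main.js"></script>
"""
print(linked_files(example))  # prints ['css/style.css', 'js/main.js']
```

Calling missing_files with the path of the directory you are serving should return an empty list when all linked files are in place.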

5.6.2 Remote with GitHub Pages

5.6.2.1 Overview

Python’s HTTP server is simple enough to start working with, but there are other difficulties if you intend to use it for “production”, i.e. in real-life scenarios, where stability is essential. For instance, you need to take care of the above-mentioned network administration issues (Section 5.6.1.5), such as making sure your server has a constant IP address and that it is not behind a firewall. In addition, you need to take care of the hardware of your server, such as making sure the computer is always running and connected to the internet, that the server is restarted in case the computer restarts, and so on. Using a remote hosting service, we basically let other people handle all of that. In other words, we don’t need to worry about any of the hardware, software, and network connection issues - just the contents of our website.

There are numerous hosting services for static web pages. For example, both Google and Amazon, as well as many other smaller companies, offer (paid) static hosting services. In this Section, we will use the GitHub platform, which is free, for hosting our static web page.

Although GitHub is mainly a platform for online storage of Git repositories and collaborative code development, one of its “side” functions is that of a static server. The static server functionality of GitHub is known as GitHub Pages.

Using GitHub Pages as a remote static server has several advantages for our purposes -

It is simple
It is free
It is part of GitHub, a popular platform for collaborative code development which is helpful to become familiar with

5.6.2.2 Git and GitHub

We will not go into detail about the functionality of GitHub other than the GitHub Pages utility, but here is some background on what it is.

When working with code, it becomes important to keep track of different versions of your projects. This allows you to undo changes made weeks or months ago. Versioning becomes even more important when collaborating with others, since in that case you may need to split your project into several “branches”, or “merge” the changes contributed by several collaborators back together. To meet these needs, people use version-control systems. One of the most popular version-control systems around today is Git.

If Git is a version-control system, then what is GitHub? Git projects are also called repositories. GitHub is a web-based Git repository hosting service. Basically, GitHub is an online service where you can store your Git repositories, either publicly or privately. The platform also contains facilities for interacting with other people, such as raising and discussing issues or subscribing to updates on repositories and developers you are interested in, creating a community of online code collaboration. For anyone who wants to take part in open-source software development, using Git and GitHub is probably the most important skill after knowing how to write the code itself.

Importantly for our purposes, for any public GitHub repository, the user can enable the GitHub Pages static server. As a result, the contents of the repository will be automatically hosted at the following address -

http://GITHUB_USER_NAME.github.io/REPOSITORY_NAME/

Where -

GITHUB_USER_NAME is the user name
REPOSITORY_NAME is the repository name

Note the slash / at the end of the address, which means that we are looking for a directory. The slash is important in this case, since GitHub Pages does not complete it automatically.
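
The address template can also be written as a short function. This is only an illustrative sketch, assuming Python. The function name github_pages_url is hypothetical, and the example values are the user name and repository name that appear in the screenshots of this Section.

```python
def github_pages_url(user_name, repository_name):
    """Return the GitHub Pages address of a repository, including the final slash."""
    return f"http://{user_name}.github.io/{repository_name}/"


# User name and repository name as in the screenshots of this Section
print(github_pages_url("michaeldorman", "test"))
# prints http://michaeldorman.github.io/test/
```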

Everything we learned about static servers applies in remote hosting too. The only difference is that the served directory is stored on another, remote server, rather than your own computer. For example, in order for a web page to be loaded when one enters a repository URL as shown above, you need to have an index.html file in the root directory of the repository.
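
To see why index.html is needed, it may help to model how a static server translates a requested path into a file. The following is a simplified sketch, assuming Python. It is not the actual GitHub Pages implementation, and the function name file_for_request and the file name style.css are hypothetical.

```python
def file_for_request(url_path, repository_name):
    """Map a requested URL path to a file path inside the repository.

    Simplified model: query strings and missing files are not handled.
    """
    prefix = f"/{repository_name}/"
    if not url_path.startswith(prefix):
        raise ValueError("the path is outside the website of this repository")
    relative = url_path[len(prefix):]
    # A request for a directory (empty path, or one ending with a slash)
    # is answered with the index.html file inside that directory
    if relative == "" or relative.endswith("/"):
        relative += "index.html"
    return relative


print(file_for_request("/test/", "test"))               # prints index.html
print(file_for_request("/test/css/style.css", "test"))  # prints css/style.css
```

In this model, a request for the repository address itself ends up at index.html in the root directory, which is why the page cannot load if that file is missing.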

5.6.2.3 Setup instructions

To host our website on GitHub Pages, go through the following steps -

Create an account on https://github.com/ (in case you don’t have one already), then sign in to your GitHub account. Your username will be included in all of the URLs for GitHub Pages you create, as in GITHUB_USER_NAME shown above. In the screenshots shown below the GitHub username is michaeldorman
Once you are logged in on https://github.com/, click the + symbol on the top-right corner and select New repository (Figure 5.6)

FIGURE 5.6: Creating a new repository on GitHub

Choose a name for your repository. This is the REPOSITORY_NAME part that users type when navigating to your site, as shown in the above URL template. In the screenshots shown below the repository name is test
Make sure the “Initialize this repository with a README” box is checked. This will create an (empty) README.md file in your repository, thus exposing the Upload Files screen which we will use to upload files into our repository
Click the Create repository button (Figure 5.7)

FIGURE 5.7: The Create Repository button

The repository page you just created should look like the one shown on Figure 5.8. The repository is empty, except for one file named README.md

FIGURE 5.8: Newly created repository, with README.md file

Click on the Settings tab to reach the repository settings
On the settings page, scroll down to the GitHub Pages section
In the Source panel, instead of None select master branch and click the Save button to the right of the dropdown menu (Figure 5.9)

FIGURE 5.9: Setting master branch as GitHub Pages source

Go back to the repository page and click the Upload files button. This will take you to the file upload screen (Figure 5.10)

FIGURE 5.10: File upload screen

Drag and drop all files and folders that comprise your website into the box. This should usually include at least an HTML document named index.html
Wait for the files to be transferred. Once all files are uploaded, click the Commit changes button (Figure 5.11)

FIGURE 5.11: The Commit changes button

That’s it! Your website should be live at the following address. Replace GITHUB_USER_NAME and REPOSITORY_NAME with your own GitHub user name and repository name. Note that you may have to wait a few moments while the site is being set up.

FIGURE 5.12: The repository with uploaded files