
Web Services

Usage, Security/SSL, Availability and Accessibility

Getting Started with User-based Web Services

GPG

Web Services – General

In general, Minerva web services are a standard Linux-based web hosting service. Services are provided through Apache 2.4 with the MPM-ITK add-on. They are designed to support basic web pages that provide web-based science portals and other applications interacting with the jobs, queues, and large science datasets on Minerva. The web services should be on par with common Unix-based web hosting services such as DreamHost. Most common web frameworks and applications should fit easily into this service, such as Django, WordPress, MediaWiki, custom Perl apps, and static content.

Server-side content is not run as the common “apache” user, but instead as the owner of that application directory. That is, each user’s content/scripts run as that user.

As of October 1st, 2022, there are two different domain names for a user website’s landing point, each with different network access:

https://userid.u.hpc.mssm.edu for internal websites
https://userid.dmz.hpc.mssm.edu for public websites

By default, each user’s web services landing point is https://userid.u.hpc.mssm.edu, with internal access only (the campus network or VPN is needed for access).

If you need a public website for your research, please fill out the form at https://redcap.link/g08ytzki. Once the request is received, the IT security team will scan the web application. If no critical or high vulnerabilities are reported, we will move the website to userid.dmz.hpc.mssm.edu for public access. The time to complete this request depends on the vulnerability status of the website. A rough estimate is 1 week.

Account Creation Time Delay: It may take between 30 minutes and 1 hour after user account creation for the system to automatically subscribe a user to web services at https://userid.u.hpc.mssm.edu. If you are a new user, please allow for this delay and check web services again before submitting a trouble ticket.

Usage, Security/SSL, Availability and Accessibility

The Minerva web services are designed to support basic web pages that provide web-based science portals and other applications interacting with the jobs, queues, and large science datasets on Minerva. The web services should be on par with common Unix-based web hosting services such as DreamHost. That said, we don’t recommend using web services as the primary website for a department or lab, for websites with heavy traffic, or for websites that require a guaranteed amount of uptime. There are no uptime guarantees. Web services may go down during Preventative Maintenance periods. We may block IPs with suspicious activity for periods of time, possibly indefinitely and possibly automatically.
One possible usage style is to create a website for your department or lab elsewhere and pull specific content generated in real time from the Minerva web services via JavaScript includes, redirects, and APIs.

Usage is unlimited for the purposes of interaction with Minerva components and viewing science.

WARNING WARNING WARNING: Be careful! Content, executables, scripts, symlinks, applications, etc. within the www/ folder may be (or are) world accessible. Scripts and applications launched via Apache in that folder run as your user! They can access any data (including your groups’ /project data), delete data, archive data, submit jobs, cancel jobs, email people, etc. as your user. You are responsible for any actions taken on your behalf!
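
Before publishing, it can help to audit the www/ folder for the kinds of exposure described in the warning. The following is a minimal Python sketch, not an official Minerva tool: the function name find_risky_entries is hypothetical, and it flags only two conditions (symlinks whose targets lie outside www/ and world-writable entries).

```python
import os
import stat
from pathlib import Path


def find_risky_entries(www):
    """Walk www/ and report symlinks that leave the tree and world-writable entries."""
    www = Path(www).resolve()
    findings = []
    # os.walk does not follow symlinks by default, so linked folders are not entered.
    for root, dirs, files in os.walk(www):
        for name in dirs + files:
            path = Path(root) / name
            if path.is_symlink():
                target = path.resolve()
                # A symlink whose target is outside www/ may expose other data.
                if target != www and www not in target.parents:
                    findings.append((str(path), "symlink leaves www/"))
                continue
            if path.lstat().st_mode & stat.S_IWOTH:
                findings.append((str(path), "world-writable"))
    return findings


if __name__ == "__main__":
    for path, reason in find_risky_entries(Path.home() / "www"):
        print(f"{reason}: {path}")
```

An empty result does not mean the folder is safe; it only means neither of these two conditions was found.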

Getting Started with User-based Web Services

Each user, by default, has a web services site. A specific user’s site is located at https://userid.u.hpc.mssm.edu. The Document Root for a user’s site is a folder called www within their home folder.
Thus, https://userid.u.hpc.mssm.edu serves the content of ~userid/www.

Step 1:
If this folder does not exist in your home directory, you should create it.

$ mkdir ~/www

Step 2:
Place content in the www folder.

$ cat > ~/www/index.html <<EOF

Hello World from my website.

EOF

Step 3:
Test your web services. Point your browser to https://userid.u.hpc.mssm.edu
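
Steps 1 through 3 can also be scripted. The following is a minimal Python sketch, assuming the www folder convention described above; create_site is a hypothetical helper name, and the userid argument stands in for your Minerva user ID.

```python
from pathlib import Path


def create_site(home, userid):
    """Create home/www with a minimal index.html and return the URL to test."""
    www = Path(home) / "www"
    www.mkdir(exist_ok=True)                 # Step 1: same effect as mkdir ~/www
    index = www / "index.html"
    if not index.exists():                   # Step 2: never overwrite existing content
        index.write_text("Hello World from my website.\n")
    return f"https://{userid}.u.hpc.mssm.edu"  # Step 3: open this URL in a browser
```

A call such as create_site(Path.home(), "userid") leaves existing content untouched, so it is safe to run more than once.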

Supported Scripting Languages

The following languages are supported: PHP, Python, Perl, and CGI (all others).
The Default extensions for each language:

Language	Extension	Index
PHP 7.3.14	.php	index.php
Python 3.8.2 (wsgi)	See Below	See Below
Perl 5.16.3	.pl	index.pl
CGI	.cgi	index.cgi

PHP:

Web Services supports PHP. Files must end in a .php extension. The service is based on PHP 7.3.14. The following modules / patches are installed: gd ldap json mysql pear xml pear-DB pgsql mbstring pecl-imagick imap snmp suhosin zlib.
Clean URLs may be supported through a .htaccess modification. Please read the PHP documentation and your application’s documentation for how this should be done for your application.

An example of “Hello World” in PHP:

<?php

print "Hello World";

?>

Python:

wsgi method:

The WSGI method of hosting Python programs is newer and more elegant than the mod_python method. It is now the standard for many frameworks, such as Django and Pyramid.
(Please note this is WSGI-Script, not the daemonized / threaded style. Most apps should be able to handle this correctly.)
The key to hosting a WSGI app is creating a directory in which you put the WSGI file, which will serve as the root of the app.
For example, if you want to host an app at https://userid.u.hpc.mssm.edu/myapp such that the URLs https://userid.u.hpc.mssm.edu/myapp/subpage and https://userid.u.hpc.mssm.edu/myapp/subpage/anotherthing are all handled by your WSGI app, then, do the following:
Create the folder by your main app name inside the root of your www folder:

mkdir ~/www/myapp

Then add a .htaccess file to declare that all files in that folder will be WSGI files.
For example, open the file in an editor:

vi ~/www/myapp/.htaccess

Line to put in the file:

SetHandler wsgi-script

At this point, if you create a test WSGI app in your folder it will work, for example:

$ cat wsgione.wsgi
def application(environ, start_response):
    status = '200 OK'
    output = b'Hello World!'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)
    return [output]
$

The app above would be available at https://userid.u.hpc.mssm.edu/myapp/wsgione.wsgi. URLs beyond that would also be handled by it, for example: https://userid.u.hpc.mssm.edu/myapp/wsgione.wsgi/one/two/three
(In this case, of course, nothing is done with those URLs, but your app could act on them, as Django does.)
Of course, this URL is not very clean, so we can rename the file from wsgione.wsgi to wsgione (just removing the .wsgi) and it will still work. Thus, you could host many apps in one folder, for example /apps/app1, /apps/app2, and so on.

That still may not be ideal for some use cases. Finally, it is possible to make a top-level folder redirect into an app. Ideally, that folder should contain only the one WSGI file and the .htaccess file.
You will need to call your WSGI file index.wsgi.
Then set your .htaccess file like so:

AddHandler wsgi-script .wsgi
DirectoryIndex index.wsgi
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.wsgi/$1 [QSA,PT,L]

This will rewrite https://userid.u.hpc.mssm.edu/myapp/one/two/three to the index.wsgi file and pass in the rest of the URL path for the app to handle.
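
To illustrate how an app might act on the rewritten URL, here is a minimal sketch of an index.wsgi that dispatches on the remaining path. It assumes the path after index.wsgi arrives in the standard WSGI PATH_INFO variable; the page name "one" is made up for the example.

```python
def application(environ, start_response):
    """Dispatch on the path that follows index.wsgi in the rewritten URL."""
    # Assumption: /myapp/one/two/three arrives as PATH_INFO "/one/two/three".
    parts = [p for p in environ.get("PATH_INFO", "").split("/") if p]
    if not parts:
        status, text = "200 OK", "index page"
    elif parts[0] == "one":
        status, text = "200 OK", "handling: " + "/".join(parts)
    else:
        status, text = "404 Not Found", "no such page"
    output = text.encode("utf-8")
    response_headers = [("Content-Type", "text/plain"),
                        ("Content-Length", str(len(output)))]
    start_response(status, response_headers)
    return [output]
```

Frameworks such as Django and Flask perform this kind of routing internally, so a hand-written dispatcher like this is mainly useful for very small apps.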

Note: mod_python is no longer supported in our new webserver.

Some demos on setting up your first Python Flask and Dash app (campus network or VPN needed to access):
https://gail01.u.hpc.mssm.edu/flask_demo/
https://gail01.u.hpc.mssm.edu/dash_demo/
Code is at https://gail01.u.hpc.mssm.edu/code/

Perl and CGI:

Perl is supported through mod_perl and CGI. Simply end your file name with a .pl or .cgi extension and ensure it is executable on the filesystem (for example, run chmod +x test.pl to make it executable). As the Perl is executed as CGI, you must put a “shebang” line at the beginning of your file that points to the perl executable. The service is based on Perl 5.10.1.
The following Perl modules are installed, possibly among others: perl-DBI perl-URI perl-IO-Socket-SSL perl-Net-DNS perl-HTML-Tagset perl-Archive-Tar perl-Net-IP perl-DBD-MySQL perl-Socket6 perl-Digest-SHA1 perl-IO-Socket-INET6 perl-Compress-Zlib perl-DBD-Pg perl-HTML-Parser perl-IO-Zlib perl-BSD-Resource perl-SGMLSpm perl-libwww-perl perl-Digest-HMAC perl-Net-SSLeay perl-Pod-POM perl-Pod-Simple perl-Date-Calc

An example Perl Hello World:

#!/usr/bin/perl
print "Content-type: text/html\n\n";
print "Hello, world!\n";

As Perl here is executed as if it were CGI, you may instead create any executable file, including a shell script (bash, csh, ksh) or even an executable binary, with an appropriate name (such as a .cgi extension) to execute it.
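
As an illustration of a non-Perl CGI program, the sketch below is a Python script that reports which account it runs as. It assumes a python3 interpreter is reachable through /usr/bin/env on the web server; the file name (for example whoami.cgi) and the helper names are hypothetical. Like the Perl example, it must be made executable with chmod +x.

```python
#!/usr/bin/env python3
import os
import pwd


def current_user():
    """Return the name of the account this process runs as."""
    uid = os.geteuid()
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)  # no passwd entry for this uid; fall back to the number


def cgi_response(body):
    """Build a CGI response: a header line, a blank line, then the body."""
    return "Content-Type: text/plain\n\n" + body


if __name__ == "__main__":
    print(cgi_response("This script runs as: " + current_user()))
```

Given that scripts run as the owner of the application directory, requesting this script through the web server should print your own user ID rather than a shared web server account.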

Authentication

It may be desirable to force visitors to authenticate before viewing a portion (or all) of your website.
Authentication is supported in two ways: System Auth or Password File

System Auth

System Auth will authenticate users using their Mt Sinai / Minerva credentials, the same ones you use to log in to Minerva over ssh. For the campus-facing web servers, password auth is supported.

To enable System Auth, you should create a .htaccess file in the root of the directory structure you wish to protect.

The .htaccess file to use for System Auth is:

AuthType Basic
AuthName Your-Site-Name
AuthBasicProvider external
AuthExternal pwauth
require valid-user
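
Once a directory is protected this way, an application inside it can usually find out who logged in. The sketch below is a WSGI app that reads the standard REMOTE_USER variable; whether the server passes that variable through to WSGI scripts is an assumption here, so test it on your own site.

```python
def application(environ, start_response):
    """Greet the user that Apache authenticated, if any."""
    # Assumption: Apache exposes the authenticated login name as REMOTE_USER.
    user = environ.get("REMOTE_USER")
    if user:
        status, text = "200 OK", "Hello, " + user
    else:
        status, text = "403 Forbidden", "no authenticated user"
    output = text.encode("utf-8")
    response_headers = [("Content-Type", "text/plain"),
                        ("Content-Length", str(len(output)))]
    start_response(status, response_headers)
    return [output]
```

Refusing requests that lack REMOTE_USER is a deliberate choice: if the .htaccess file is ever removed by mistake, the app fails closed instead of serving content to anonymous visitors.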

Password File

Password File allows you to create a file of username and password combinations that permit successful logins. Password File is not secure: it allows weak credentials such as user “test” with password “test”. It is not for serious authentication; however, it can be useful for light protection of folders that you just don’t want Google to accidentally run into, for example. Please do not create bogus passwords such as test or password.
To enable Password File auth, you will need to create a password file and create a .htaccess file in the root of the directory structure you wish to protect.

The password file should sit somewhere outside of the www/ folder. A good place may be a folder in your home directory. If you desire, you can create a folder in your home directory just for storing these types of files, or possibly a hidden folder in your home directory (one beginning with a dot).

To create a folder and password file and user named test:

$ mkdir ~/.htpasswords
$ htpasswd -c ~/.htpasswords/passdb1 test
New password:
Re-type new password:
Adding password for user test

The .htaccess file to use for Password File is:

AuthType Basic
AuthName "private area"
AuthUserFile    "/hpc/users/user1/.htpasswords/passdb1"
Require            valid-user
Order allow,deny
Allow from all

Note: You should place the full and proper path to the password file in the AuthUserFile line.
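
The two rules above (a full path, and a password file outside www/) can be checked before the .htaccess file is written. The following is a minimal Python sketch under those assumptions; password_htaccess is a hypothetical helper, and the directives it emits simply mirror the Password File example.

```python
from pathlib import Path


def password_htaccess(realm, passfile, www):
    """Return .htaccess text for Password File auth, refusing unsafe locations."""
    passfile = Path(passfile).resolve()   # always emit the full path
    www = Path(www).resolve()
    if passfile == www or www in passfile.parents:
        raise ValueError("password file must live outside www/")
    if '"' in realm:
        raise ValueError("realm must not contain double quotes")
    return (
        "AuthType Basic\n"
        f'AuthName "{realm}"\n'
        f'AuthUserFile "{passfile}"\n'
        "Require valid-user\n"
        "Order allow,deny\n"
        "Allow from all\n"
    )
```

The returned text can then be written to the .htaccess file in the folder you want to protect.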

General:
You may prefer some variations. For example:
Varying the AuthName line will show the user a different prompt at the login box.
Changing require valid-user to require user user1 user2 will restrict access to user1 and user2 only. Similarly, require group group2 will restrict access to group2.
Other variations are possible. Anything that may be supported in a .htaccess file is allowed (with some restrictions). You can block based on remote IP, etc.
Some sites show many possible variations.

Please read the Apache documentation on Basic Auth and the Require directive.

Group-based Web Services

As described above, web services are currently only created, by default, for each user. For a group-based web service, or a custom-named web service, please submit a ticket by emailing hpchelp@hpc.mssm.edu with your request. Do note that the web service will need to run as some specific user, so please plan ahead as to which user should own the web service.
Similarly, custom web services (see below) on a per-user or per-group basis may use non-default domain names, however, it is a special request.

Domain Names

By default, the Scientific Computing group places all services below the hpc.mssm.edu domain. For automated purposes, we place the generated web service accounts below the u.hpc.mssm.edu domain (u for “users”). Although this provides a quick way for users to start working with web services, it may not be a desirable URL for certain cases. Shorter domains typically look nicer. Long URLs with usernames can be confusing.

The Scientific Computing group does not have the ability to acquire or control URLs outside of our own domain. However, your user / group may go through Mt Sinai IT to acquire some other *.mssm.edu (or similar) domain and/or purchase a domain from a public entity, such as Network Solutions, ENOM, or GoDaddy. You will also need some sort of Domain Name Hosting service. You can then create an A or CNAME record to point to our web hosting IP address(es). We can configure your account to accept requests for these additional domains.

The Scientific Computing group and the web services do not provide Domain Name Hosting or Email Services.

SSL

SSL with domain names outside the hpc.mssm.edu domain presents some challenges. Each unique domain will require an additional IP Address. This will need to be evaluated on a case-by-case basis to see if there are remaining addresses to address the need. Please plan ahead before making any purchases to ensure the service will function correctly.
Other hybrid variations may prove useful. For example, the non-secure content can be on the custom domain and the secure content can be on the default u.hpc.mssm.edu domain. The two sites can pass the user back and forth as needed.

GPG

What Is GPG
GPG stands for GNU Privacy Guard and is a set of encryption tools to protect data transmitted from one person to another over an untrustworthy medium such as the internet. Used properly it can ensure that any third-party who unintentionally obtains a file will be unable to decrypt the contents.

Usage of GPG on Minerva
On Minerva, GPG is best suited for publishing files on user/DMZ pages for access by people outside of Minerva. Within Minerva, it is more secure to use file permissions or Access Control Lists (ACLs) to allow other users to access your files.

Limitations of GPG on Minerva
GPG cannot be used to distribute PHI/PII or any other form of controlled data. Files containing PHI/PII are strictly not permitted on user web pages even if they are encrypted.

How does GPG Work

Using GPG on Minerva

Step 1: Generate a Key Pair

On Minerva run `gpg --gen-key`. This will begin a series of prompts which are explained below.
For the first prompt select option “(1) RSA and RSA (default)”. This option determines which type of keys to generate.
For the second prompt press enter to select the default size of the key (2048 bit). For added protection from brute force you can enter 4096.
- Do not use any value less than 2048.
The third option determines how long the key will be good for. Files may be encrypted using temporary keys which will expire after a defined amount of time. Most users will want a key that will never expire.
- Enter 0 to generate a key with an infinite lifetime, or use the formatting specified in the prompt to enter the duration for which you want the key to be valid.
The next three prompts request your real name, email address, and an optional comment.
- After entering these, review the entries and when ready select “(O)kay” by entering O.
A prompt will appear asking for a passphrase. This is a password that is required to unlock the key.
- *DO NOT* use your Minerva password or any password you use on any other service for this purpose. This should be a unique password. A second prompt will ask you to verify the password.
- *DO NOT* generate keys with no passphrase as these offer no security.
There will be a message that begins with “We need to generate a lot of random bytes.” The key is now being generated which can take several minutes to complete.
Your keys have now been created and can be verified by running `gpg --list-keys`.
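
The answers given at these prompts can also be supplied through GnuPG's unattended key generation mode, in which the settings are read from a parameter file passed to `gpg --batch --gen-key`. The sketch below only builds that parameter file in Python. The helper name keygen_parameters is hypothetical, and behavior differs between GnuPG versions, so treat it as a starting point and check `man gpg` on Minerva. The %ask-passphrase line asks gpg to prompt for a passphrase and not create an unprotected key; to my knowledge newer GnuPG releases ignore it and prompt anyway.

```python
def keygen_parameters(name, email, bits=4096, expire="0", comment=""):
    """Build a GnuPG unattended key generation parameter file as text."""
    if bits < 2048:
        raise ValueError("do not use a key size below 2048 bits")
    lines = [
        "%ask-passphrase",            # prompt for the passphrase interactively
        "Key-Type: RSA",
        f"Key-Length: {bits}",
        "Subkey-Type: RSA",
        f"Subkey-Length: {bits}",
        f"Name-Real: {name}",
    ]
    if comment:
        lines.append(f"Name-Comment: {comment}")
    lines += [
        f"Name-Email: {email}",
        f"Expire-Date: {expire}",     # 0 means the key never expires
        "%commit",
    ]
    return "\n".join(lines) + "\n"
```

The passphrase is deliberately left out of the generated file so that it is never stored on disk in plain text.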

Step 2: Encrypt a File

Run the command `gpg -e` followed by the name of the file you want to encrypt.
A prompt will ask you to “Enter the user ID”. This should be an email address. You may use your own email address.
When all recipients have been entered press RETURN to continue.

You will now have a file with the same name but ending in .gpg in the directory. For example, if you encrypt “private-file.txt” you will have a new “private-file.txt.gpg” which has been encrypted.

Step 3: Decrypting a File

Run the command `gpg` followed by the name of the encrypted file.
To decrypt the file “private-file.txt.gpg” you would enter `gpg private-file.txt.gpg`.
You will be prompted to enter the passphrase you configured in Step 1. If you enter the wrong passphrase, the file cannot be decrypted.
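
Steps 2 and 3 can be run without the interactive recipient prompt by naming recipients on the command line. The following is a minimal Python sketch that only assembles the gpg argument lists; the helper names are hypothetical, and the long options used (--encrypt, --recipient, --output, --decrypt) are standard gpg options.

```python
def encrypt_command(path, recipients):
    """Return the gpg argument list that encrypts path for the given recipients."""
    if not recipients:
        raise ValueError("at least one recipient is required")
    command = ["gpg", "--encrypt"]
    for recipient in recipients:
        command += ["--recipient", recipient]
    return command + [path]            # gpg writes path + ".gpg" by default


def decrypt_command(encrypted, output):
    """Return the gpg argument list that decrypts encrypted into output."""
    return ["gpg", "--output", output, "--decrypt", encrypted]
```

Either list can be passed to subprocess.run(command, check=True); passing a list and not a shell string avoids quoting problems with unusual file names.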

Distributing Keys
Any person who downloads a file you have encrypted will need a copy of the public key that you generated in Step 1. This can be sent via any medium.

A best practice is to distribute the key directly to intended recipients via email.

Obtaining the Public Key to Distribute

Run `gpg --export -a --output KEY_NAME.asc EMAIL` where KEY_NAME will be the filename and EMAIL is the email address you entered as part of Step 1.

This will create a file with the specified name that contains the key in plain text. You may attach this file to email or copy/paste the file contents into the email body. The recipient will simply need to save a copy of this data to a file which can be imported into GPG on their computer.
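
Before attaching the exported file to an email, it is worth confirming that it really contains a public key and not a private one. The sketch below is a small Python check based on the standard ASCII armor header lines; the function name is hypothetical.

```python
def looks_like_public_key(text):
    """Return True if text is an ASCII-armored public key and nothing private."""
    has_public = "-----BEGIN PGP PUBLIC KEY BLOCK-----" in text
    has_private = "-----BEGIN PGP PRIVATE KEY BLOCK-----" in text
    return has_public and not has_private
```

For example, looks_like_public_key(open("KEY_NAME.asc").read()) should be True for a file produced by the export command above.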

Importing Keys
A recipient who has saved the key to a file can import it into GPG on their computer by running `gpg --import KEY_NAME.asc`, where KEY_NAME.asc is the saved file.

More Information
There are many options in GPG that relate to what type of encryption is done, and ways to automate the entire process by populating prompt data from the command line. Please view the man page using `man gpg` or by visiting
https://www.gnupg.org/documentation/manpage.html

The GPG webpage has extensive information about usage of GPG, best practices, and advanced topics. Please visit https://www.gnupg.org/index.html