Important data stored on Minerva can be protected by archiving the data in the IBM Tivoli Storage Management (TSM) system (renamed as Spectrum Protect after v8.1.7) installed on a Minerva server.
The TSM system will create two long-term tape copies of your data. One copy will be stored in the IBM tape library that is part of the Minerva complex and be available for rapid recall; the second copy will be stored off-site in a secure data vault for disaster protection. Data on both copies will be encrypted to deter unauthorized access.
Data Retention Policy
Archived data has a retention time of 6 years, after which it is deleted. Please check the expiration date of your archived files; this is the responsibility of the user!
How to Access TSM
The TSM client (Spectrum Protect client v8.1.7) is installed on all internal login nodes, i.e., minerva13 and minerva14, and on the data nodes. Users can issue the archive commands, dsmc or dsmj, from either of the internal login nodes.
TSM cannot be accessed from external login nodes
Trying to use one of these commands on an external login node will result in a “Command not found” response.
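A quick way to tell whether the node you are logged in to has the client installed is to ask the shell where dsmc lives. The following is only a minimal sketch; the function name tsm_client_status is made up for this example and is not part of TSM.

```shell
# Report whether the TSM command-line client is on the PATH of this node.
# tsm_client_status is a hypothetical helper name, not part of TSM.
tsm_client_status() {
    if command -v dsmc >/dev/null 2>&1; then
        echo "dsmc found at $(command -v dsmc)"
    else
        echo "dsmc not found; log in to an internal login node or a data node"
    fi
}

tsm_client_status
```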
Data that is archived is grouped in the TSM system by nodes. A node is an abstraction and can be physically many things. On Minerva, each user is considered a node to the TSM system and the node identity for each user is the userid.
The TSM system can be accessed via either a GUI or the command line. The command line mode is particularly useful when archiving large datasets because the command can be run inside a screen session. The screen session can then be detached and the command can run unattended for the hours it may take to archive the data.
Tar small files before archive.
Every archived file is entered into a database. To prevent overflowing this database, we ask that you first use the tar command to bundle small files into a unix tar archive and then archive the tar file to TSM. For information about tar, see the man pages (man tar) on Minerva or check out this link.
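The bundling step could look roughly like the sketch below. The function name bundle_small_files and the paths in the comments are hypothetical, and the dsmc archive line is shown only as a comment so that nothing is sent to TSM by accident.

```shell
# Bundle a directory of small files into one tar file before archiving.
# bundle_small_files and the example paths are hypothetical.
bundle_small_files() {
    src_dir=$1      # directory holding the small files
    tar_file=$2     # tar file to create (keep it outside src_dir)
    # -C switches to the parent directory so the tar file stores relative paths.
    tar -cf "$tar_file" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
    # Read the listing back to confirm the bundle is readable before archiving.
    tar -tf "$tar_file" >/dev/null
}

# Example (not run here):
#   bundle_small_files "$HOME/project/small_files" "$HOME/project/small_files.tar"
#   dsmc archive "$HOME/project/small_files.tar"
```

Archiving one tar file adds a single entry to the TSM database instead of one entry per small file.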
Command line and screen are recommended for large data archiving or retrieval.
It is not recommended to use the GUI for large data archiving or retrieval because you would have to keep the interactive session open until all of the data are archived. Instead, start a “screen” session and issue the archive command from the command line. You can then detach the screen session and the command will continue executing.
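One possible way to combine screen and the command line is sketched below. The helper names run and start_archive_session, the session name tsm_archive and the file name small_files.tar are all assumptions for illustration. By default the sketch only prints the command it would run; set DRY_RUN=0 to actually start the session.

```shell
# Run a command, or only print it when DRY_RUN is 1 (the default here).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start a named, already-detached screen session that runs one archive command.
# The session name and file name are hypothetical.
start_archive_session() {
    session=$1
    tar_file=$2
    # -d -m starts screen detached; -S names the session for later reattachment.
    run screen -dmS "$session" dsmc archive "$tar_file"
}

start_archive_session tsm_archive small_files.tar
# Reattach later with:  screen -r tsm_archive
# List sessions with:   screen -ls
```

A session started this way closes when the command exits. If you want to watch the progress output, you can instead start an interactive session with screen -S tsm_archive, run the dsmc command inside it, and detach with Ctrl-a d.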
Long retrieval time is expected.
Due to the large amount of archived data and the number of tapes, most of the tapes are stored in a cabinet rather than in the TSM library. Our operators get an email notification when you issue a retrieve request, and they fetch the desired tape and load it into the library. This process is manual, and the response time of the operators is one and a half hours. During this time the process shows “[ -]” without progressing.
Once the tape is loaded into the TSM library, the library will automatically mount the tape and read its data. This data transfer time is reasonably fast.
Note that tape check-in errors may also occur when there are simultaneous retrieval requests. If you get an error such as “data is unavailable”, please send in a ticket and we will be happy to resolve it for you.
Warning: If you specify that files should be deleted automatically after archiving and then subsequently delete the archive object, the data will be permanently lost.
Frequently Asked Questions
For a more extensive discussion of using TSM, see IBM Spectrum Protect Manual
Can I use TSM for backup?
We do not recommend using TSM for backup unless your work truly requires it. TSM archive meets the needs of most users, which is to keep copies of data that will not be needed for a while. Please use TSM archive when you can. If you have important files that change constantly and you really need to back them up, please complete the Minerva TSM Backup Request Form and we will contact you to discuss it. If you run a TSM backup without making a request, the process will be killed.
How to access other users’ archived files
What to do if you receive Permission Denied
During retrieval you may encounter the following error message: ANS1590W I/O error writing file attribute: security.selinux for: /dir-to-file/file. errno = 13, Permission denied.
This error occurs because TSM failed to write some extended security information for your file; we have that feature turned off. The same error message may also appear when untarring tarballs that you imported from another installation. Despite this error message, the retrieval will continue and the data will not be affected.
Error message with file currently unavailable on server
Users are able to query the archived file, but the retrieve fails with error message:
11/14/17 18:03:31 ANS4035W File ‘filename’ currently unavailable on server.
This happens when either the tape information is incorrect or the tape was not loaded into the library correctly. Please send in a ticket and we will fix it for you.
Error message with file write protected and unable to retrieve to the disk
When retrieving a file, users may get an error message stating that the file is “write protected” and cannot be written to the designated directory. TSM may prompt for options, but choosing “Force an overwrite for this object” does not work. Specifying a different destination directory for which the user has write permission does not help either; the same error persists.
This normally happens because the write permission bit was removed from the file before it was archived. Normally, this kind of file can be written to /hpc and /tmp, but not to GPFS file systems such as /sc/hydra, since GPFS has high security settings on the top layer. It is not possible to change the file permission once the file is archived, but we can provide a workaround.
If the file is small, please check the free space in your home directory (/hpc/users/userid) or in the /tmp directory on the login nodes (use “df -h” and look at the available size for “/”). Retrieve to one of these two directories first and then move the file to your desired directory. Please monitor the usage of /tmp closely and do not let it go over 70%.
If the file is too large to fit in either of these two directories, please send in a ticket and the admins will retrieve it for you.
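The space check described above can be scripted. The sketch below is one possible way to do it, with hypothetical helper names usage_percent and below_threshold; it reads the used percentage from df and compares it with the 70% limit.

```shell
# Print the used percentage (without the % sign) of the file system holding DIR.
usage_percent() {
    # -P forces the POSIX one-line-per-file-system format; column 5 is the capacity.
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeed only if DIR is below the given usage threshold (70 by default).
below_threshold() {
    dir=$1
    limit=${2:-70}
    [ "$(usage_percent "$dir")" -lt "$limit" ]
}

if below_threshold /tmp 70; then
    echo "/tmp is below 70% used"
else
    echo "/tmp is at or above 70% used; retrieve elsewhere or send in a ticket"
fi
```

This only reports current usage. You still need to compare the size of the file you are retrieving with the available space that df -h shows.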
Error message with exceeded maximum number of mount points
Users may get the following error when using TSM:
ANS0326E This node has exceeded its maximum number of mount points
This happens because a maximum of 4 TSM connections are allowed at a time for each node/user. Both the dsmc and dsmj commands count toward this limit. Please limit your concurrent TSM connections to 4 and the error will go away. You may also want to check whether there are orphaned TSM processes from your earlier TSM activity; terminating them will also free up TSM connections.
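To look for leftover client processes, something like the following sketch may help. The function names count_tsm_clients and my_tsm_clients are hypothetical. Counting processes is only a rough proxy for the number of TSM connections, and the assumption that the clients appear in the process list under the names dsmc and dsmj should be checked on your node (a GUI client may show up under a different process name).

```shell
# Count dsmc and dsmj entries in a list of command names read from stdin.
count_tsm_clients() {
    awk '$1 == "dsmc" || $1 == "dsmj" { n++ } END { print n + 0 }'
}

# List the command names of the current user's processes and count the clients.
my_tsm_clients() {
    ps -u "$(id -un)" -o comm= | count_tsm_clients
}

# Example (not run here):
#   my_tsm_clients                           # prints a number; keep it at 4 or fewer
#   ps -u "$(id -un)" -o pid=,etime=,comm=   # long-running dsmc/dsmj entries may be orphans
```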
Can I keep my archived data beyond the 6-year retention time?
For files needed past their expiration date, we suggest you retrieve those files and archive them again. This is good practice for two reasons: