Data Storage Hardware and Services

Data Storage

Network Attached Storage

Network attached storage (NAS) devices are boxes that contain both storage and the hardware needed to manage the storage. They can be thought of as a small computer with lots of storage. Using a NAS to store your research data has many benefits. Because they are internet accessible, it is easy to centralize data collected on different instruments and to access data for later analysis. Most models contain multiple hard drives and can be set up with RAID to protect against data loss in case of a hard drive failure. NAS devices are generally affordable ($300-$1500 depending on the storage space needed) and are usually cheaper than purchasing cloud storage over 4-5 years. We currently provide instructions for setting up Synology NAS devices, but many manufacturers make similar products. If you've got a different NAS, let us know at data@caltech.edu and we can work on putting together setup instructions.
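When deciding how many drives to buy, it helps to remember that RAID trades raw capacity for protection. The sketch below estimates usable space for the standard RAID levels 0, 1, 5, and 6. The function name is made up for illustration, and vendor-specific modes and filesystem overhead are not modeled, so treat the result as a rough planning number.

```python
def usable_capacity_tb(drive_sizes_tb, raid_level):
    """Estimate usable capacity in TB for a standard RAID level.

    Standard RAID treats every drive as if it were the size of the
    smallest drive in the array.
    """
    n = len(drive_sizes_tb)
    smallest = min(drive_sizes_tb)
    # Minimum drive counts for each supported level.
    minimum_drives = {0: 2, 1: 2, 5: 3, 6: 4}
    if raid_level not in minimum_drives:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    if n < minimum_drives[raid_level]:
        raise ValueError(f"RAID {raid_level} needs at least {minimum_drives[raid_level]} drives")
    if raid_level == 0:
        return n * smallest        # striping: no protection against drive failure
    if raid_level == 1:
        return smallest            # mirroring: every drive holds the same data
    if raid_level == 5:
        return (n - 1) * smallest  # one drive's worth of parity
    return (n - 2) * smallest      # RAID 6: two drives' worth of parity


# A two-bay NAS with two 4 TB drives in RAID 1 stores about 4 TB.
print(usable_capacity_tb([4, 4], 1))
```

A two-drive mirror survives one drive failure, which is why it is a common choice for small two-bay devices, at the cost of half the raw capacity.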

See the instructions for setting up a Synology NAS below.

Interested in trying out a NAS? Send an email to data@caltech.edu to get access to our demo NAS!

Box.com

Caltech IMSS manages a campus site license for Box.com. Campus users get 50 GB of free storage. Box.com is a good resource for storing backup copies of data and syncing between computers, but it should not be used as a primary data storage location. Note that Box.com has a 5 GB individual file limit and lacks a Linux sync client. Continued availability of Box.com is dependent on IMSS and Box. A comparison of IMSS-provided file storage systems is available at http://imss.caltech.edu/services/collaboration-storage-backups/storage-comparison
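Because of the 5 GB individual file limit, it can be useful to scan a data folder for oversized files before relying on Box.com as a backup. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the limit is interpreted as 5 × 1024³ bytes, which is an assumption; adjust it if Box counts differently.

```python
import os

# 5 GB limit from the text, interpreted here as 5 GiB (assumption).
BOX_FILE_LIMIT_BYTES = 5 * 1024**3


def find_oversized_files(root, limit_bytes=BOX_FILE_LIMIT_BYTES):
    """Return (path, size_in_bytes) for every file under root larger than limit_bytes."""
    oversized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                continue
            if size > limit_bytes:
                oversized.append((path, size))
    # Largest files first, so the worst offenders are easy to spot.
    return sorted(oversized, key=lambda item: item[1], reverse=True)
```

Calling `find_oversized_files("/path/to/data")` returns the files that would need to be split, compressed, or stored elsewhere.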

Have questions about other storage services? Send an email to data@caltech.edu.

High Performance Computing (HPC) Resources

Caltech Resources

IMSS now provides centralized HPC resources via a campus cluster.  There is a per-hour charge for computing, and research groups can make an investment to get additional computing time.  Find all the details at hpc.caltech.edu

XSEDE

XSEDE is a National Science Foundation funded nationwide high performance computing resource. Researchers can request time on more than 10 national supercomputers, visualization resources, storage systems, and scientific gateways; the current resources are listed below. A separate NSF grant is not required to gain access to these resources. Caltech users interested in testing one of these systems can contact the Caltech Campus Champion, Tom Morrell at tmorrell@caltech.edu, for trial access. Faculty, Postdoctoral Researchers, and NSF Graduate Research Fellows can submit a startup allocation request, which provides up to 50,000 compute hours to test XSEDE resources. The startup allocation request process is very straightforward and enables users to quickly access computing resources. After one year or when the startup resources are exhausted, researchers can submit a more thorough research allocation proposal.
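To get a feel for how far a startup allocation goes, a quick back-of-the-envelope calculation helps. The sketch below assumes that a "compute hour" is charged as one core for one hour; actual charging rules differ between systems, so check the documentation for the specific resource. The function names are illustrative only.

```python
STARTUP_ALLOCATION_HOURS = 50_000  # upper bound quoted in the text


def core_hours(nodes, cores_per_node, wall_hours):
    """Core-hours consumed by one job (assumes every core on each node is charged)."""
    return nodes * cores_per_node * wall_hours


def runs_within_allocation(nodes, cores_per_node, wall_hours,
                           allocation=STARTUP_ALLOCATION_HOURS):
    """Number of identical jobs that fit inside the allocation."""
    per_run = core_hours(nodes, cores_per_node, wall_hours)
    if per_run <= 0:
        raise ValueError("job size must be positive")
    return allocation // per_run


# One 24-core node running for 2 days (48 hours).
print(core_hours(1, 24, 48))              # 1152
print(runs_within_allocation(1, 24, 48))  # 43
```

Under this assumption, a single-node, two-day job on a 24-core node could be repeated about 43 times before a 50,000-hour allocation runs out.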

Current XSEDE Resources (2020-03)

Each entry lists the resource name, host, node specifications, and maximum queue time.

Traditional

Comet, SDSC, 24 Core; 128 GB RAM; 320 GB SSD, 2 days 
Stampede2, TACC, select KNL or SKX: 68 or 48 Core; 96 or 192 GB RAM; 107 or 144 GB SSD, 4 days 
Bridges, PSC, 28 core; 128 GB RAM; 8 TB Storage, 2 days (can also schedule segments of a node)

GPU

Comet, SDSC, NVIDIA K80, 2 days
Bridges, PSC, NVIDIA K80 or P100, 2 days

Large Memory

Bridges Large, PSC, 3 or 12 TB RAM; 16 or 64 TB Storage, 4 days
Comet, SDSC, 64 Core; 1.5 TB RAM; 400 GB SSD, 2 days

Virtualized/Distributed

Jetstream, Indiana/TACC, Can spin up various sized custom imaged environments
Open Science Grid, Distributed computing for smaller jobs (single thread, < 2 GB memory, 1-12 hour execution, <10 GB storage)
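The Open Science Grid limits above translate directly into a quick self-check. The sketch below encodes only the four figures quoted in this list (single thread, under 2 GB memory, 1-12 hours, under 10 GB storage). The `Job` class and function name are hypothetical, and the real service may apply further rules.

```python
from dataclasses import dataclass


@dataclass
class Job:
    threads: int
    memory_gb: float
    runtime_hours: float
    storage_gb: float


def osg_issues(job):
    """Return the reasons a job falls outside the Open Science Grid guidelines listed above."""
    issues = []
    if job.threads != 1:
        issues.append("job is not single-threaded")
    if job.memory_gb >= 2:
        issues.append("memory is not under 2 GB")
    if not 1 <= job.runtime_hours <= 12:
        issues.append("runtime is outside 1-12 hours")
    if job.storage_gb >= 10:
        issues.append("storage is not under 10 GB")
    return issues


# An empty list means the job fits the guidelines.
print(osg_issues(Job(threads=1, memory_gb=1.5, runtime_hours=6, storage_gb=5)))
```

Jobs that fail several of these checks are usually better matched to one of the traditional or large memory systems.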

CyVerse

CyVerse (formerly iPlant) is another NSF-funded cyberinfrastructure project that provides computing resources, primarily targeted at life science researchers. They offer free access to Atmosphere, a cloud-based computing resource where you can spin up computing resources with specific images.  A basic allocation is available by registering, and additional allocations require an application.

Have questions about high performance computing resources? Send an email to data@caltech.edu.

Synology NAS Instructions

Tested on a Synology DS216j in Summer 2016.  Should apply to all Synology Products that use DSM 6.0.  Updated 2018-04. 

Setup Instructions

  1. Make sure you have a Phillips screwdriver available.  Open the package and slide off the cover to reveal the hard drive trays.
  2. Gently slide the hard drive into place, making sure that the data connectors are lined up.  Secure with four screws.
  3. Repeat for all other drives, then replace the cover and secure it with two of the smaller screws (these should be in a separate package).
  4. Get a hostname or static IP address for the NAS.  While there are other ways of connecting, a static IP is the easiest.  The MAC address for the device is printed on a sticker near the Ethernet port.  If your department runs its own network, contact them and provide the MAC address for your new device.  If IMSS is responsible for your network, you'll want to make up a name for your device like "name.caltech.edu".  Check to see whether your name is currently in use by someone else by following the instructions at http://www.imss.caltech.edu/services/wired-wireless-remote-access/wired-network/check-available-hostnames. Then make your request at http://help.caltech.edu (request type IMSS-->Network, Wireless & Remote Access-->Host and Address Requests) and provide the contact information for the individual responsible for the NAS, the building and room of the NAS, and the desired hostname.  Remember that the address must be used every two months or IMSS can take it back.
  5. Once you have a static IP address, power on the NAS and connect it to the campus network.  Access the NAS by going to a web browser and typing name.caltech.edu, where name is the specific hostname you requested.  You can also type the specific IP address (e.g., 131.215.226.19).  If this doesn't work, you might have to connect to your NAS using the Synology Assistant application.  Download the application from https://www.synology.com/en-us/support/download on a computer that is connected to the same local network as the NAS.  You can then find and connect to your NAS.  Once you complete setup steps 6-10, you'll need to manually assign the correct IP address.  Go to Control Panel/Network and select Network Interface in the top menu.  Then click LAN, click the Edit button at the top, select manual configuration, and enter the correct IP address.
  6. Follow the prompts to install the DiskStation software and set up your new disks.  
  7. Set up your administrator account.  It's best to set the server name to the name you selected in step 4.  Make sure the password you select is strong (long, with letters, numbers, and special characters).  I choose not to share the location of my NAS with Synology.
  8. Skip setting up an account with Synology.  It's not really useful in this setup.
  9. You can install the recommended packages now or skip this step; you can always install them later.
  10. Set up automated updates.  This is the best way to stay on top of security patches.
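If the browser cannot reach the NAS in step 5, it can help to confirm from a script whether the hostname resolves and the device accepts connections at all. The sketch below uses only the Python standard library. Port 80 is what a browser uses for a plain http address; some DSM setups listen on a different port such as 5000, which is an assumption to verify for your device. The function name is made up for illustration.

```python
import socket


def can_connect(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the hostname and opens the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_connect("name.caltech.edu")` checks the hostname you requested, and `can_connect("131.215.226.19")` checks the IP address directly. If the IP address works but the hostname does not, the problem is likely with name registration rather than with the NAS.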

You should now be in the main control panel of your NAS.  You're ready to administer your NAS.  A few things you might want to set up are below.

Folders

  1. Folders are where you store data; they also help you organize data and control access to it.  Click on Control Panel and Shared Folder.
  2. Click Create
  3. Pick a name for the folder.  I generally leave the other options unchecked, but you can consider these features depending on your specific needs
  4. Set user permissions for the folder.   If you don't have users yet, just click OK

Groups

  1. Groups are useful if you want to allow access to files or folders for a specific set of users or control how much of the storage space a bunch of users can access.  Click on Control Panel and Group.
  2. Click Create
  3. Make a group name
  4. Set the shared folders that this group will have access to
  5. Set the total amount of storage space the group will be able to use.  Clicking Next without setting a limit will allow the group to use the entire NAS.
  6. Set any specific applications that the group can access.  The defaults here are usually good.
  7. Assign upload speed limits.  Don't worry about this unless you've got a lot of users and bandwidth constraints.
  8. Confirm the settings and click Apply

User Accounts

  1. Click on Control Panel and User.  
  2. Click Create
  3. Enter a Name and pick a strong password (long, with letters, numbers, and special characters).
  4. Select the groups the user will be part of (users is the required default group)
  5. Select the folders the user should have access to.  If the group selected in step 4 already has access to folders, this will be noted in the second column.
  6. Set the storage quota for the user.  Setting group quotas generally works better.  Just click Next to give the user an unlimited quota.
  7. Assign permissions for applications.  The defaults are typically correct.  Click Next
  8. Assign upload speed limits.  Don't worry about this unless you've got a lot of users and bandwidth constraints.  Click Next
  9. Check to make sure the settings look good and click Apply
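The password guidance above (long, with letters, numbers, and special characters) can be turned into a simple checklist. This is only a sketch: the 12-character minimum is a hypothetical threshold, since the guide only says "long", and passing these checks does not guarantee that a password is hard to guess.

```python
import string

MIN_LENGTH = 12  # hypothetical threshold; the guide only says "long"


def password_problems(password, min_length=MIN_LENGTH):
    """Return a list of ways the password falls short of the guidance above."""
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if not any(c.isalpha() for c in password):
        problems.append("no letters")
    if not any(c.isdigit() for c in password):
        problems.append("no numbers")
    if not any(c in string.punctuation for c in password):
        problems.append("no special characters")
    return problems


# An empty list means every check passed.
print(password_problems("correct-horse-42-battery"))
```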

Accessing the NAS from a Windows Computer

  1. Click on Control Panel and File Services
  2. The Enable Windows file service box should already be checked.  If not, check it
  3. Look in the blue box for your connection information.  It should be \\hostname, where hostname is the server name from step 7 in the setup instructions.

Accessing the NAS from a Mac Computer

  1. Click on Control Panel and File Services
  2. The Enable Mac file service box should already be checked.  If not, check it
  3. Look in the blue box for your connection information.  It should be afp://hostname, where hostname is the server name from step 7 in the setup instructions.  If you want to access a specific shared folder, your address will be afp://hostname/folder
  4. On your Mac, click on Finder, Go, and Connect to Server
  5. Paste the address from step 3 into the address box and click Connect
  6. Enter your username and password
  7. The shared folder now appears under "Shared" in Finder.  You can save files to this location like any other location on your computer.

Box.com Integration

  1. If you're a faculty or staff member, request unlimited storage by going to https://help.caltech.edu (request type IMSS > Data Storage & Backup > Request Additional File Storage > Box)
  2. On the NAS, go to Package Center
  3. Find "Cloud Sync" and install the package
  4. Follow the instructions to link your Box.com Caltech account
  5. You'll want to connect your shared folder with a folder on Box.  By default, if you create a new folder it won't sync with your desktop client, so you can store both research data and personal data.  You can set up the syncing direction.  Bidirectional means you can modify research data on Box.com and have the changes sync back to the NAS.  Upload local changes only means new files on the NAS get copied to Box.  Upload only is the safest option.

Troubleshooting

Do you have saved credentials that your Mac is using?  Explicitly include your username by connecting to afp://name:*@131.215.226.19/folder
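If several people in a group need this kind of address, a small helper can build it consistently. The sketch below follows the address forms shown in this guide, including the `*` placeholder after the username, which is copied from the troubleshooting address above. The function name is hypothetical, and percent-encoding of unusual characters is an added precaution rather than something the guide specifies.

```python
from urllib.parse import quote


def afp_url(host, folder="", username=None):
    """Build an afp:// address, optionally with an explicit username and shared folder."""
    # The "*" after the username mirrors the troubleshooting address in this guide.
    user_part = f"{quote(username, safe='')}:*@" if username else ""
    # Percent-encode spaces and other unsafe characters in the folder name.
    path = "/" + quote(folder.strip("/")) if folder else ""
    return f"afp://{user_part}{host}{path}"


print(afp_url("131.215.226.19", "folder", "name"))
# afp://name:*@131.215.226.19/folder
```

The result can be pasted into the Connect to Server box in Finder.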