Running R script with R objects (RDS)

Hi there,

I am new to Galileo.
I am trying to run a piece of R code that reads in R objects. However, when I try to run the script, the files (ending in .RDS) are not listed in the directory, despite having been uploaded to that same directory, so the code will not run.
The problem seems to be limited to RDS files: other files, e.g. .csv files, are listed when I print the contents of the directory.

The error is: Error in gzfile(file, "rb") : cannot open the connection
Calls: readRDS -> gzfile
and
cannot open compressed file 'filename.RDS', probable reason 'No such file or directory'
Execution halted

Could you help me with this?

Thank you in advance,
Viola

Hi @ky18705, I’m taking a look at this now to see if I can reproduce the problem. I’ll get back to you shortly. Also, if you could share the code snippet here that is throwing the error, that would help a lot.

Hi Todd,

Thank you so much. The error seems to be thrown at the readRDS line at the beginning of my loop:

regions = c("SW", "SE", "NW", "NE")
print(paste0(regions))
print(getwd())
print(list.files())
datalist = list()
print("starting loop")
for (i in 1:4){

  region_str = regions[i]

  test1 = readRDS(paste0(region_str, "_model_rf_cforest_seed1_2percent.RDS")) …

But the files are not listed when I print all files prior to the loop.
Thank you so much,
Viola

Thanks for the snippet, @ky18705. I made a simple R example that reads in a .RDS file the way you do above, and it ran properly.
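For reference, the test was along these lines (a minimal sketch; the object and filename are placeholders, not your actual data):

# Write a small built-in dataset out as .RDS, then read it back in,
# mirroring the readRDS() call in your loop.
saveRDS(mtcars, "test.RDS")
test1 = readRDS("test.RDS")
print(head(test1))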

Try checking the byte count on the .RDS files in your Mission file viewer, and make sure the file sizes look correct. If one says 0 kB or something that looks obviously too small, then the file probably didn't upload properly and is actually being stored as an empty file. This would lead to the behavior you are seeing, where the files don't show up when you call print(list.files()).
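You can also check this from within the script itself. Here is a minimal sketch, assuming the .RDS files sit in the job's working directory:

# Print each .RDS file's on-disk size in bytes; ignore.case = TRUE
# catches both .rds and .RDS extensions. A size of 0 means the upload
# didn't complete and readRDS() will fail on that file.
rds_files = list.files(pattern = "\\.RDS$", ignore.case = TRUE)
print(setNames(file.size(rds_files), rds_files))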

Let me know what you find here.

Hi Todd,

Indeed, it seems the files were not uploaded properly, even though the size shown in MB was >> 0 MB. I deleted the files and re-uploaded them without refreshing the page, and I was able to get the script running beyond this point in the code.

Thank you for your help!!
Viola

Awesome! Thanks for reaching out to us on the forum. If you need anything going forward, you can always tag me or the Hypernet group to notify us.

@todd and @Hypernet team - Thank you so much for your help last week and this week.
Unfortunately, I have another problem. I am running a loop over data output from a random forest analysis.
The files are all different sizes, and on the largest RDS files (~500 MB) the remaining code in the loop does not run, but no error is thrown. The code simply stops and the job is "completed".

It seems the issue is with the line that reads in the file (readRDS). All the files are now fully uploaded and print to screen when prompted.
Does the system have a maximum file size that it can read in? I have tried it a bunch of times now (5+ times).
My work is being sent to Andromeda-21.

Kind regards,
Viola

Hey @ky18705, currently, individual jobs are limited to 8GB of RAM. If the sum total of your datafiles exceeds 8GB it would cause the behavior you are seeing. If this is the case, I can bump your memory cap up so you can continue your work.
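If you want a rough sense of the footprint yourself, here is a sketch you could run locally (it assumes the .RDS files are in the working directory). Keep in mind that a compressed .RDS can expand to several times its on-disk size once loaded:

# Total on-disk size of the .RDS inputs, in MB.
rds_files = list.files(pattern = "\\.RDS$", ignore.case = TRUE)
print(sum(file.size(rds_files)) / 1024^2)

# In-memory footprint of one loaded object; this is what counts
# against the job's RAM cap, and it can be much larger than the file.
obj = readRDS(rds_files[1])
print(object.size(obj), units = "MB")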

Dear @todd,

Overall, my data files do not exceed 8GB. Loading this specific RDS file does increase RAM usage on my personal laptop, but not up to 8GB. However, I can't tell for sure how much the additional processes in the loop require. I am sorry, I am not a computer/RAM expert. Would it be possible to try increasing the RAM and seeing what happens, even for just a week or so? I appreciate that this may not be possible.

Cheers,
Viola

Hey @ky18705, just letting you know that I upped your RAM limit by an additional 4GB per job.