Channel: Intel® Fortran Compiler

Text file to allocatable string


Fortran gurus:

I'm looking for the fastest, safest, and most portable way to read the entire contents of a text file into a Fortran allocatable string.  Here's what I've come up with:

    subroutine read_file(filename, str)

    implicit none

    character(len=*),intent(in)  :: filename
    character(len=:),allocatable,intent(out) :: str

    !parameters:
    integer,parameter  :: n_chunk = 256      !chunk size for reading file [arbitrary]
    character(len=*),parameter :: nfmt = '(A256)'    !corresponding format statement
    character(len=1),parameter :: newline = new_line('')

    integer :: iunit, istat, isize
    character(len=n_chunk) :: chunk
    integer :: filesize,ipos
    character(len=:),allocatable :: tmp

    !how many characters are in the file:
    inquire(file=filename, size=filesize)  !is this portable?

    !initialize:
    ipos = 1    !where to put the next chunk

    !preallocate the str array to speed up the process for large files:
    !str = ''
    allocate( character(len=filesize) :: str )

    !open the file:
    open(newunit=iunit, file=trim(filename), status='OLD', iostat=istat)

    if (istat==0) then

        !read all the characters from the file:
        do

            read(iunit,fmt=nfmt,advance='NO',size=isize,iostat=istat) chunk

            if (istat==0) then

                !str = str//chunk
                str(ipos:ipos+isize-1) = chunk
                ipos = ipos+isize

            elseif (IS_IOSTAT_EOR(istat)) then

                if (isize>0) then
                    !str = str//chunk(1:isize)//newline
                    str(ipos:ipos+isize) = chunk(1:isize)//newline
                    ipos = ipos+isize+1
                else
                    !str = str//newline
                    str(ipos:ipos) = newline
                    ipos = ipos + 1
                end if

            elseif (IS_IOSTAT_END(istat)) then

                if (isize>0) then
                    !str = str//chunk(1:isize)
                    !(no newline appended: the file ended without one)
                    str(ipos:ipos+isize-1) = chunk(1:isize)
                    ipos = ipos+isize
                end if

                exit

            else
                stop 'Error'
            end if

        end do

        !resize the string
        if (ipos<filesize+1) str = str(1:ipos-1)

        close(iunit, iostat=istat)

    else
        write(*,*) 'Error opening file: '//trim(filename)
    end if

    end subroutine read_file

Some notes/questions about this:

  1. This routine will read the 100 MB file at https://github.com/seductiveapps/largeJSON in about 1 sec on my PC.
  2. Is it really portable to use the SIZE argument of INQUIRE to get the number of characters? I notice that the string I end up with is somewhat smaller than this value, but that could be due to #3.  What is the portable way to get the file size in characters? I'd like it to work with compilers other than ifort, and on other platforms.
  3. I don't think this approach preserves Windows line breaks (if present), since it essentially reads the file line by line and then inserts the newline character after each line. The string I end up with is smaller than the reported file size (which is why I'm trimming it at the end).  Is there a way to read the file so that the line breaks are included as is?
  4. The original (naive) version of the routine (see the commented-out bits, e.g., str = str//chunk) is extremely slow and also causes stack overflows for very large strings.  The slowness makes sense to me due to all the reallocations, but I didn't expect it to cause stack overflows.  Is that to be expected?
  5. Any other improvements that anyone can see? 
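
One possible direction for items 2 and 3, offered as a sketch rather than a definitive answer: open the file with Fortran 2003 unformatted stream access and read it in a single transfer, so that whatever line-break bytes are in the file end up in the string unchanged. The names `file_to_string` and `read_file_stream`, and the error code 1 used for an unknown size, are hypothetical and not part of the routine above. The sketch assumes that the processor's file storage unit is one byte, that one default character occupies one byte, and that the file fits in a default integer.

```fortran
module file_to_string
    implicit none
    private
    public :: read_file_stream

contains

    ! Hypothetical helper (not from the original post): read a whole file
    ! into STR as raw bytes using unformatted stream access.
    ! On failure ISTAT is nonzero and STR may be left unallocated.
    subroutine read_file_stream(filename, str, istat)
        character(len=*), intent(in) :: filename
        character(len=:), allocatable, intent(out) :: str
        integer, intent(out) :: istat

        integer :: iunit, filesize

        open(newunit=iunit, file=filename, status='old', action='read', &
             access='stream', form='unformatted', iostat=istat)
        if (istat /= 0) return

        ! SIZE= reports the size in file storage units (assumed to be bytes);
        ! it is set to -1 when the size cannot be determined.
        inquire(unit=iunit, size=filesize)
        if (filesize < 0) then
            istat = 1          ! arbitrary nonzero code chosen for this sketch
            close(iunit)
            return
        end if

        ! One allocation, one read: no per-line processing, so CR/LF
        ! pairs (if any) are kept exactly as stored in the file.
        allocate(character(len=filesize) :: str)
        if (filesize > 0) read(iunit, iostat=istat) str
        close(iunit)
    end subroutine read_file_stream

end module file_to_string
```

A call would look like `call read_file_stream('data.json', str, istat)`, with `'data.json'` standing in for any file name. Because nothing is inserted or dropped, `len(str)` should match the value returned by INQUIRE, so no trimming is needed at the end. Files larger than `huge(0)` bytes would need a wider integer kind for `filesize`.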

 

