package IO::ScalarArray;
=head1 NAME
IO::ScalarArray - IO:: interface for reading/writing an array of scalars
=head1 SYNOPSIS
Perform I/O on strings, using the basic OO interface...
use IO::ScalarArray;
@data = ("My mes", "sage:\n");
### Open a handle on an array, and append to it:
$AH = new IO::ScalarArray \@data;
$AH->print("Hello");
$AH->print(", world!\nBye now!\n");
print "The array is now: ", @data, "\n";
### Open a handle on an array, read it line-by-line, then close it:
$AH = new IO::ScalarArray \@data;
while (defined($_ = $AH->getline)) {
print "Got line: $_";
}
$AH->close;
### Open a handle on an array, and slurp in all the lines:
$AH = new IO::ScalarArray \@data;
print "All lines:\n", $AH->getlines;
### Get the current position (either of two ways):
$pos = $AH->getpos;
$offset = $AH->tell;
### Set the current position (either of two ways):
$AH->setpos($pos);
$AH->seek($offset, 0);
### Open an anonymous temporary array:
$AH = new IO::ScalarArray;
$AH->print("Hi there!");
print "I printed: ", @{$AH->aref}, "\n"; ### get at value
Don't like OO for your I/O? No problem.
Thanks to the magic of an invisible tie(), the following now
works out of the box, just as it does with IO::Handle:
use IO::ScalarArray;
@data = ("My mes", "sage:\n");
### Open a handle on an array, and append to it:
$AH = new IO::ScalarArray \@data;
print $AH "Hello";
print $AH ", world!\nBye now!\n";
print "The array is now: ", @data, "\n";
### Open a handle on a string, read it line-by-line, then close it:
$AH = new IO::ScalarArray \@data;
while (<$AH>) {
print "Got line: $_";
}
close $AH;
### Open a handle on a string, and slurp in all the lines:
$AH = new IO::ScalarArray \@data;
print "All lines:\n", <$AH>;
### Get the current position (WARNING: requires 5.6):
$offset = tell $AH;
### Set the current position (WARNING: requires 5.6):
seek $AH, $offset, 0;
### Open an anonymous temporary scalar:
$AH = new IO::ScalarArray;
print $AH "Hi there!";
print "I printed: ", @{$AH->aref}, "\n"; ### get at value
And for you folks with 1.x code out there: the old tie() style still works,
though this is I<unnecessary and deprecated>:
use IO::ScalarArray;
### Writing to a scalar...
my @a;
tie *OUT, 'IO::ScalarArray', \@a;
print OUT "line 1\nline 2\n", "line 3\n";
print "Array is now: ", @a, "\n"
### Reading and writing an anonymous scalar...
tie *OUT, 'IO::ScalarArray';
print OUT "line 1\nline 2\n", "line 3\n";
tied(OUT)->seek(0,0);
while (<OUT>) {
print "Got line: ", $_;
}
=head1 DESCRIPTION
This class is part of the IO::Stringy distribution;
see L<IO::Stringy> for change log and general information.
The IO::ScalarArray class implements objects which behave just like
IO::Handle (or FileHandle) objects, except that you may use them
to write to (or read from) arrays of scalars. Logically, an
array of scalars defines an in-core "file" whose contents are
the concatenation of the scalars in the array. The handles created by
this class are automatically tiehandle'd (though please see L<"WARNINGS">
for information relevant to your Perl version).
For writing large amounts of data with individual print() statements,
this class is likely to be more efficient than IO::Scalar.
Basically, this:
my @a;
$AH = new IO::ScalarArray \@a;
$AH->print("Hel", "lo, "); ### OO style
$AH->print("world!\n"); ### ditto
Or this:
my @a;
$AH = new IO::ScalarArray \@a;
print $AH "Hel", "lo, "; ### non-OO style
print $AH "world!\n"; ### ditto
Causes @a to be set to the following array of 3 strings:
( "Hel" ,
"lo, " ,
"world!\n" )
See L<IO::Scalar> and compare with this class.
=head1 PUBLIC INTERFACE
=cut
use Carp;
use strict;
use vars qw($VERSION @ISA);
use IO::Handle;
# The package version, both in 1.23 style *and* usable by MakeMaker:
$VERSION = "2.111";
# Inheritance:
@ISA = qw(IO::Handle);
require IO::WrapTie and push @ISA, 'IO::WrapTie::Slave' if ($] >= 5.004);
#==============================
=head2 Construction
=over 4
=cut
#------------------------------
=item new [ARGS...]
I<Class method.>
Return a new, unattached array handle.
If any arguments are given, they're sent to open().
=cut
sub new {
my $proto = shift;
my $class = ref($proto) || $proto;
my $self = bless \do { local *FH }, $class;
tie *$self, $class, $self;
$self->open(@_); ### open on anonymous by default
$self;
}
sub DESTROY {
shift->close;
}
#------------------------------
=item open [ARRAYREF]
I<Instance method.>
Open the array handle on a new array, pointed to by ARRAYREF.
If no ARRAYREF is given, a "private" array is created to hold
the file data.
Returns the self object on success, undefined on error.
=cut
sub open {
my ($self, $aref) = @_;
### Sanity:
defined($aref) or do {my @a; $aref = \@a};
(ref($aref) eq "ARRAY") or croak "open needs a ref to a array";
### Setup:
$self->setpos([0,0]);
*$self->{AR} = $aref;
$self;
}
#------------------------------
=item opened
I<Instance method.>
Is the array handle opened on something?
=cut
sub opened {
*{shift()}->{AR};
}
#------------------------------
=item close
I<Instance method.>
Disassociate the array handle from its underlying array.
Done automatically on destroy.
=cut
sub close {
my $self = shift;
%{*$self} = ();
1;
}
=back
=cut
#==============================
=head2 Input and output
=over 4
=cut
#------------------------------
=item flush
I<Instance method.>
No-op, provided for OO compatibility.
=cut
sub flush { "0 but true" }
#------------------------------
=item fileno
I<Instance method.>
No-op, returns undef
=cut
sub fileno { }
#------------------------------
=item getc
I<Instance method.>
Return the next character, or undef if none remain.
This does a read(1), which is somewhat costly.
=cut
sub getc {
my $buf = '';
($_[0]->read($buf, 1) ? $buf : undef);
}
#------------------------------
=item getline
I<Instance method.>
Return the next line, or undef on end of data.
Can safely be called in an array context.
Currently, lines are delimited by "\n".
=cut
sub getline {
my $self = shift;
my ($str, $line) = (undef, '');
### Minimal impact implementation!
### We do the fast thing (no regexps) if using the
### classic input record separator.
### Case 1: $/ is undef: slurp all...
if (!defined($/)) {
return undef if ($self->eof);
### Get the rest of the current string, followed by remaining strings:
my $ar = *$self->{AR};
my @slurp = (
substr($ar->[*$self->{Str}], *$self->{Pos}),
@$ar[(1 + *$self->{Str}) .. $#$ar ]
);
### Seek to end:
$self->_setpos_to_eof;
return join('', @slurp);
}
### Case 2: $/ is "\n":
elsif ($/ eq "\012") {
### Until we hit EOF (or exited because of a found line):
until ($self->eof) {
### If at end of current string, go fwd to next one (won't be EOF):
if ($self->_eos) {++*$self->{Str}, *$self->{Pos}=0};
### Get ref to current string in array, and set internal pos mark:
$str = \(*$self->{AR}[*$self->{Str}]); ### get current string
pos($$str) = *$self->{Pos}; ### start matching from here
### Get from here to either \n or end of string, and add to line:
$$str =~ m/\G(.*?)((\n)|\Z)/g; ### match to 1st \n or EOS
$line .= $1.$2; ### add it
*$self->{Pos} += length($1.$2); ### move fwd by len matched
return $line if $3; ### done, got line with "\n"
}
return ($line eq '') ? undef : $line; ### return undef if EOF
}
### Case 3: $/ is ref to int. Bail out.
elsif (ref($/)) {
croak '$/ given as a ref to int; currently unsupported';
}
### Case 4: $/ is either "" (paragraphs) or something weird...
### Bail for now.
else {
croak '$/ as given is currently unsupported';
}
}
#------------------------------
=item getlines
I<Instance method.>
Get all remaining lines.
It will croak() if accidentally called in a scalar context.
=cut
sub getlines {
my $self = shift;
wantarray or croak("can't call getlines in scalar context!");
my ($line, @lines);
push @lines, $line while (defined($line = $self->getline));
@lines;
}
#------------------------------
=item print ARGS...
I<Instance method.>
Print ARGS to the underlying array.
Currently, this always causes a "seek to the end of the array"
and generates a new array entry. This may change in the future.
=cut
sub print {
my $self = shift;
push @{*$self->{AR}}, join('', @_) . (defined($\) ? $\ : ""); ### add the data
$self->_setpos_to_eof;
1;
}
#------------------------------
=item read BUF, NBYTES, [OFFSET];
I<Instance method.>
Read some bytes from the array.
Returns the number of bytes actually read, 0 on end-of-file, undef on error.
=cut
sub read {
my $self = $_[0];
### we must use $_[1] as a ref
my $n = $_[2];
my $off = $_[3] || 0;
### print "getline\n";
my $justread;
my $len;
($off ? substr($_[1], $off) : $_[1]) = '';
### Stop when we have zero bytes to go, or when we hit EOF:
my @got;
until (!$n or $self->eof) {
### If at end of current string, go forward to next one (won't be EOF):
if ($self->_eos) {
++*$self->{Str};
*$self->{Pos} = 0;
}
### Get longest possible desired substring of current string:
$justread = substr(*$self->{AR}[*$self->{Str}], *$self->{Pos}, $n);
$len = length($justread);
push @got, $justread;
$n -= $len;
*$self->{Pos} += $len;
}
$_[1] .= join('', @got);
return length($_[1])-$off;
}
#------------------------------
=item write BUF, NBYTES, [OFFSET];
I<Instance method.>
Write some bytes into the array.
=cut
sub write {
my $self = $_[0];
my $n = $_[2];
my $off = $_[3] || 0;
my $data = substr($_[1], $n, $off);
$n = length($data);
$self->print($data);
return $n;
}
=back
=cut
#==============================
=head2 Seeking/telling and other attributes
=over 4
=cut
#------------------------------
=item autoflush
I<Instance method.>
No-op, provided for OO compatibility.
=cut
sub autoflush {}
#------------------------------
=item binmode
I<Instance method.>
No-op, provided for OO compatibility.
=cut
sub binmode {}
#------------------------------
=item clearerr
I<Instance method.> Clear the error and EOF flags. A no-op.
=cut
sub clearerr { 1 }
#------------------------------
=item eof
I<Instance method.> Are we at end of file?
=cut
sub eof {
### print "checking EOF [*$self->{Str}, *$self->{Pos}]\n";
### print "SR = ", $#{*$self->{AR}}, "\n";
return 0 if (*{$_[0]}->{Str} < $#{*{$_[0]}->{AR}}); ### before EOA
return 1 if (*{$_[0]}->{Str} > $#{*{$_[0]}->{AR}}); ### after EOA
### ### at EOA, past EOS:
((*{$_[0]}->{Str} == $#{*{$_[0]}->{AR}}) && ($_[0]->_eos));
}
#------------------------------
#
# _eos
#
# I<Instance method, private.> Are we at end of the CURRENT string?
#
sub _eos {
(*{$_[0]}->{Pos} >= length(*{$_[0]}->{AR}[*{$_[0]}->{Str}])); ### past last char
}
#------------------------------
=item seek POS,WHENCE
I<Instance method.>
Seek to a given position in the stream.
Only a WHENCE of 0 (SEEK_SET) is supported.
=cut
sub seek {
my ($self, $pos, $whence) = @_;
### Seek:
if ($whence == 0) { $self->_seek_set($pos); }
elsif ($whence == 1) { $self->_seek_cur($pos); }
elsif ($whence == 2) { $self->_seek_end($pos); }
else { croak "bad seek whence ($whence)" }
return 1;
}
#------------------------------
#
# _seek_set POS
#
# Instance method, private.
# Seek to $pos relative to start:
#
sub _seek_set {
my ($self, $pos) = @_;
### Advance through array until done:
my $istr = 0;
while (($pos >= 0) && ($istr < scalar(@{*$self->{AR}}))) {
if (length(*$self->{AR}[$istr]) > $pos) { ### it's in this string!
return $self->setpos([$istr, $pos]);
}
else { ### it's in next string
$pos -= length(*$self->{AR}[$istr++]); ### move forward one string
}
}
### If we reached this point, pos is at or past end; zoom to EOF:
return $self->_setpos_to_eof;
}
#------------------------------
#
# _seek_cur POS
#
# Instance method, private.
# Seek to $pos relative to current position.
#
sub _seek_cur {
my ($self, $pos) = @_;
$self->_seek_set($self->tell + $pos);
}
#------------------------------
#
# _seek_end POS
#
# Instance method, private.
# Seek to $pos relative to end.
# We actually seek relative to beginning, which is simple.
#
sub _seek_end {
my ($self, $pos) = @_;
$self->_seek_set($self->_tell_eof + $pos);
}
#------------------------------
=item tell
I<Instance method.>
Return the current position in the stream, as a numeric offset.
=cut
sub tell {
my $self = shift;
my $off = 0;
my ($s, $str_s);
for ($s = 0; $s < *$self->{Str}; $s++) { ### count all "whole" scalars
defined($str_s = *$self->{AR}[$s]) or $str_s = '';
###print STDERR "COUNTING STRING $s (". length($str_s) . ")\n";
$off += length($str_s);
}
###print STDERR "COUNTING POS ($self->{Pos})\n";
return ($off += *$self->{Pos}); ### plus the final, partial one
}
#------------------------------
#
# _tell_eof
#
# Instance method, private.
# Get position of EOF, as a numeric offset.
# This is identical to the size of the stream - 1.
#
sub _tell_eof {
my $self = shift;
my $len = 0;
foreach (@{*$self->{AR}}) { $len += length($_) }
$len;
}
#------------------------------
=item setpos POS
I<Instance method.>
Seek to a given position in the array, using the opaque getpos() value.
Don't expect this to be a number.
=cut
sub setpos {
my ($self, $pos) = @_;
(ref($pos) eq 'ARRAY') or
die "setpos: only use a value returned by getpos!\n";
(*$self->{Str}, *$self->{Pos}) = @$pos;
}
#------------------------------
#
# _setpos_to_eof
#
# Fast-forward to EOF.
#
sub _setpos_to_eof {
my $self = shift;
$self->setpos([scalar(@{*$self->{AR}}), 0]);
}
#------------------------------
=item getpos
I<Instance method.>
Return the current position in the array, as an opaque value.
Don't expect this to be a number.
=cut
sub getpos {
[*{$_[0]}->{Str}, *{$_[0]}->{Pos}];
}
#------------------------------
=item aref
I<Instance method.>
Return a reference to the underlying array.
=cut
sub aref {
*{shift()}->{AR};
}
=back
=cut
#------------------------------
# Tied handle methods...
#------------------------------
### Conventional tiehandle interface:
sub TIEHANDLE { (defined($_[1]) && UNIVERSAL::isa($_[1],"IO::ScalarArray"))
? $_[1]
: shift->new(@_) }
sub GETC { shift->getc(@_) }
sub PRINT { shift->print(@_) }
sub PRINTF { shift->print(sprintf(shift, @_)) }
sub READ { shift->read(@_) }
sub READLINE { wantarray ? shift->getlines(@_) : shift->getline(@_) }
sub WRITE { shift->write(@_); }
sub CLOSE { shift->close(@_); }
sub SEEK { shift->seek(@_); }
sub TELL { shift->tell(@_); }
sub EOF { shift->eof(@_); }
sub BINMODE { 1; }
#------------------------------------------------------------
1;
__END__
# SOME PRIVATE NOTES:
#
# * The "current position" is the position before the next
# character to be read/written.
#
# * Str gives the string index of the current position, 0-based
#
# * Pos gives the offset within AR[Str], 0-based.
#
# * Inital pos is [0,0]. After print("Hello"), it is [1,0].
=head1 WARNINGS
Perl's TIEHANDLE spec was incomplete prior to 5.005_57;
it was missing support for C<seek()>, C<tell()>, and C<eof()>.
Attempting to use these functions with an IO::ScalarArray will not work
prior to 5.005_57. IO::ScalarArray will not have the relevant methods
invoked; and even worse, this kind of bug can lie dormant for a while.
If you turn warnings on (via C<$^W> or C<perl -w>),
and you see something like this...
attempt to seek on unopened filehandle
...then you are probably trying to use one of these functions
on an IO::ScalarArray with an old Perl. The remedy is to simply
use the OO version; e.g.:
$AH->seek(0,0); ### GOOD: will work on any 5.005
seek($AH,0,0); ### WARNING: will only work on 5.005_57 and beyond
=head1 VERSION
$Id: ScalarArray.pm,v 1.7 2005/02/10 21:21:53 dfs Exp $
=head1 AUTHOR
=head2 Primary Maintainer
Dianne Skoll (F<dfs@roaringpenguin.com>).
=head2 Principal author
Eryq (F<eryq@zeegee.com>).
President, ZeeGee Software Inc (F<http://www.zeegee.com>).
=head2 Other contributors
Thanks to the following individuals for their invaluable contributions
(if I've forgotten or misspelled your name, please email me!):
I<Andy Glew,>
for suggesting C<getc()>.
I<Brandon Browning,>
for suggesting C<opened()>.
I<Eric L. Brine,>
for his offset-using read() and write() implementations.
I<Doug Wilson,>
for the IO::Handle inheritance and automatic tie-ing.
=cut
#------------------------------
1;