Pretrained Keras model is returning the same result on Android
I have created an image classifier in Keras and later saved the model in .pb format to use it on Android.
However, the Python code classifies images properly, while on Android the output is always the same, no matter which image I give as input.
This is how I have trained my model:
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
# Initialising the CNN
classifier = Sequential()
# Step 1 - Convolution
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Adding a second convolutional layer
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Step 3 - Flattening
classifier.add(Flatten())
# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 1, activation = 'sigmoid'))
# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# Part 2 - Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')
test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')
classifier.fit_generator(training_set,
                         steps_per_epoch = 8000,
                         epochs = 25,
                         validation_data = test_set,
                         validation_steps = 2000)
classifier.summary()
classifier.save('saved_model.h5')
Later, I converted that Keras model (saved_model.h5) to a TensorFlow model.
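The conversion script isn't shown here; for reference, a typical Keras-to-frozen-graph conversion under TensorFlow 1.x looks roughly like the sketch below (an assumption on my part, not necessarily the exact script used; it freezes the session graph so the weights become constants in model.pb):

from keras import backend as K
import tensorflow as tf

sess = K.get_session()
# freeze variables into constants so the graph is self-contained
output_node_names = [out.op.name for out in classifier.outputs]
frozen_graph = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), output_node_names)
tf.train.write_graph(frozen_graph, '.', 'model.pb', as_text=False)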
This is how I convert my bitmap to a float array:
public static float[] getPixels(Bitmap bitmap) {
    final int IMAGE_SIZE = 64;
    int[] intValues = new int[IMAGE_SIZE * IMAGE_SIZE];
    float[] floatValues = new float[IMAGE_SIZE * IMAGE_SIZE * 3];
    if (bitmap.getWidth() != IMAGE_SIZE || bitmap.getHeight() != IMAGE_SIZE) {
        // rescale the bitmap if needed
        bitmap = ThumbnailUtils.extractThumbnail(bitmap, IMAGE_SIZE, IMAGE_SIZE);
    }
    bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
    for (int i = 0; i < intValues.length; ++i) {
        final int val = intValues[i];
        // unpack each ARGB int into per-channel floats - without this the
        // image would be fed as [1, 64, 64, 1] but the model needs [1, 64, 64, 3]
        floatValues[i * 3 + 2] = Color.red(val) / 255.0f;
        floatValues[i * 3 + 1] = Color.green(val) / 255.0f;
        floatValues[i * 3] = Color.blue(val) / 255.0f;
    }
    return floatValues;
}
Later, I tried to classify an image using TensorFlow on Android, like the following.
TensorFlowInferenceInterface tensorFlowInferenceInterface;
tensorFlowInferenceInterface = new TensorFlowInferenceInterface(getAssets(), "model.pb");
float[] output = new float[2];
tensorFlowInferenceInterface.feed("conv2d_11_input", getPixels(bitmap), 1, 64, 64, 3);
tensorFlowInferenceInterface.run(new String[]{"dense_12/Sigmoid"});
tensorFlowInferenceInterface.fetch("dense_12/Sigmoid", output);
Whatever image I give, the output value is always [1, 0].
Is there anything I have missed?
The color components returned by Color.red(int), Color.blue(int) and Color.green(int) are integers in the range [0, 255] (see the docs). The same holds when reading images using Keras' ImageDataGenerator. However, as I stated in the comments section, in the prediction phase you need to apply the same preprocessing steps as in the training phase. You are scaling the image pixels by 1./255 in training (using rescale = 1./255 in ImageDataGenerator), and therefore, according to the first point I mentioned, this must also be done in prediction:
floatValues[i * 3 + 2] = Color.red(val) / 255.0f;
floatValues[i * 3 + 1] = Color.green(val) / 255.0f;
floatValues[i * 3] = Color.blue(val) / 255.0f;
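A quick way to confirm both sides now agree is to run the same image through the saved Keras model on the desktop with identical preprocessing (a sketch; 'test.jpg' is a hypothetical file name):

import numpy as np
from keras.models import load_model
from keras.preprocessing import image

model = load_model('saved_model.h5')
img = image.load_img('test.jpg', target_size=(64, 64))  # hypothetical test image
x = image.img_to_array(img) / 255.0  # the same 1./255 rescale used in training
pred = model.predict(np.expand_dims(x, axis=0))
print(pred)  # should match the Android output for the same image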
Related
pytorch KAIR example on Android
I'm stuck trying to trace/script KAIR's FFDNet model for Android. The model's forward looks like this:

def forward(self, x):  #, paddingBottom, paddingRight):  #, sigma):
    noise_level_model = 15
    sigma = torch.full((1, 1, 1, 1), noise_level_model / 255.).type_as(x)
    h, w = x.size()[-2:]
    paddingBottom = int(np.ceil(h/2)*2-h)
    paddingRight = int(np.ceil(w/2)*2-w)
    x = torch.nn.ReplicationPad2d((0, paddingRight, 0, paddingBottom))(x)
    x = self.m_down(x)
    # m = torch.ones(sigma.size()[0], sigma.size()[1], x.size()[-2], x.size()[-1]).type_as(x).mul(sigma)
    m = sigma.repeat(1, 1, x.size()[-2], x.size()[-1])
    x = torch.cat((x, m), 1)
    x = self.model(x)
    x = self.m_up(x)
    x = x[..., :h, :w]
    return x

If I trace that, I get some warnings about the padding arguments, but the model works on Android. The problem is that it doesn't work with inputs of different sizes, only with the same size as 'test1.jpeg':

model_name = 'ffdnet_color'
model_pool = 'model_zoo'
model_path = os.path.join(model_pool, model_name + '.pth')
n_channels = 3
nc = 96
nb = 12
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = net(in_nc=n_channels, out_nc=n_channels, nc=nc, nb=nb, act_mode='R')
model.load_state_dict(torch.load(model_path), strict=True)
model.eval()
for k, v in model.named_parameters():
    v.requires_grad = False
model = model.to(device)

img = 'testsets/myset/test1.jpeg'
img_name, ext = os.path.splitext(os.path.basename(img))
img_L = util.imread_uint(img, n_channels=n_channels)
img_L = util.uint2single(img_L)
noise_level_model = 15
img_L = util.single2tensor4(img_L)
img_L = img_L.to(device)
sigma_ = torch.full((1, 1, 1, 1), noise_level_model / 255)
sigma = torch.full((1, 1, 1, 1), noise_level_model / 255.).type_as(img_L)

traced_model = torch.jit.trace(model, img_L)
traced_optimized = optimize_for_mobile(traced_model)
save_path = os.path.splitext(os.path.basename(model_path))[0] + '-mobile.pth'
traced_optimized.save(save_path)

I've tried to script the model with traced_model = torch.jit.script(model), but got this error:

TypeError: cannot create weak reference to 'numpy.ufunc' object

What should I do so the model works with different input sizes on mobile?
I encountered a similar issue. It was due to my model using numpy math operations (which are numpy.ufunc). I fixed the issue by replacing all numpy ufuncs (e.g. np.add, np.ceil, as well as +, - etc. on ndarrays) with the corresponding torch versions (e.g. torch.add, torch.sub etc.).
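Applied to the forward() above, the numpy-based padding computation can be replaced with plain integer arithmetic, which both tracing and scripting handle; a minimal sketch of the substitution (equivalent for non-negative h and w):

# before (np.ceil is a numpy.ufunc and breaks torch.jit.script):
#   paddingBottom = int(np.ceil(h/2)*2 - h)
#   paddingRight = int(np.ceil(w/2)*2 - w)
# after (pure Python integer arithmetic, no numpy involved):
paddingBottom = (h + 1) // 2 * 2 - h
paddingRight = (w + 1) // 2 * 2 - w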
How to pass image to tflite model in android
I have converted a YOLO model to .tflite for use on Android. This is how it was used in Python:

net = cv2.dnn.readNet("yolov2.weights", "yolov2.cfg")
classes = []
with open("yolov3.txt", "r") as f:
    classes = [line.strip() for line in f.readlines()]
layer_names = net.getLayerNames()
output_layers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]
colors = np.random.uniform(0, 255, size=(len(classes), 3))
cap = cv2.VideoCapture(0)
while True:
    _, frame = cap.read()
    height, width, channel = frame.shape
    blob = cv2.dnn.blobFromImage(frame, 0.00392, (320, 320), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outs = net.forward(output_layers)
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.2:
                # Object detected
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                # Rectangle coordinates
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)

I used netron (https://github.com/lutzroeder/netron) to visualize the model. The input is described as name: inputs, type: float32[1,416,416,3], quantization: 0 ≤ q ≤ 255, location: 399, and the output as name: output_boxes, type: float32[1,10647,8], location: 400. My problem is with using this model on Android. I have loaded the model into an "Interpreter tflite", and I am getting the input frames from the camera in byte[] format. How can I convert them into the required input for tflite.run(input, output)?
You need to resize the input image to match the input size of the TensorFlow Lite model, and then convert it to RGB format to feed to the model. By using the ImageProcessor from the TensorFlow Lite Support Library, you can easily do image resizing and conversion:

ImageProcessor imageProcessor =
    new ImageProcessor.Builder()
        .add(new ResizeWithCropOrPadOp(cropSize, cropSize))
        .add(new ResizeOp(imageSizeX, imageSizeY, ResizeMethod.NEAREST_NEIGHBOR))
        .add(new Rot90Op(numRotation))
        .add(getPreprocessNormalizeOp())
        .build();
return imageProcessor.process(inputImageBuffer);

Next, to run inference, you feed the preprocessed image to the TensorFlow Lite interpreter:

tflite.run(inputImageBuffer.getBuffer(), outputProbabilityBuffer.getBuffer().rewind());

Refer to this official example for more details; additionally, you can refer to this example as well.
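For the camera frames, the loading side might look like the following sketch, assuming the byte[] holds a compressed image that can be decoded to a Bitmap (raw YUV frames would first need a YUV-to-RGB conversion instead of BitmapFactory); frameBytes and imageProcessor are hypothetical names, with imageProcessor built as above:

// decode the camera byte[] to a Bitmap, then wrap it in a TensorImage
Bitmap bitmap = BitmapFactory.decodeByteArray(frameBytes, 0, frameBytes.length);
TensorImage inputImageBuffer = new TensorImage(DataType.FLOAT32);
inputImageBuffer.load(bitmap);
inputImageBuffer = imageProcessor.process(inputImageBuffer);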
How to fix the image preprocessing difference between tensorflow and android studio?
I'm trying to build a classification model with Keras and deploy the model to my Android phone. I use the code from this website to deploy my own converted model, which is a .pb file, to my Android phone. I load an image on my phone and everything works fine, but the prediction result is totally different from the result I get on my PC.

The procedure for testing on my PC is:

1. load the image with cv2 and convert it to np.float32
2. use the Keras ResNet50 preprocess_input Python function to preprocess the image
3. expand the image dimension for batching (batch size is 1)
4. forward the image to the model and get the result

Relevant code:

img = cv2.imread('./my_test_image.jpg')
x = preprocess_input(img.astype(np.float32))
x = np.expand_dims(x, axis=0)
net = load_model('./my_model.h5')
prediction_result = net.predict(x)

I noticed that the image preprocessing part on Android is different from the method I used in Keras, which is mode caffe (convert the images from RGB to BGR, then zero-center each color channel with respect to the ImageNet dataset). It seems that the original code is for mode tf (which scales pixels between -1 and 1). So I modified the following 'preprocessBitmap' code to what I think it should be, and used a 3-channel RGB image with pixel value [127,127,127] to test it. The code predicted the same result as the .h5 model did. But when I load a photo to classify, the prediction result is different from the .h5 model. Does anyone have any idea? Thank you very much.

I have tried the following:

1. Loading a 3-channel RGB image on my phone with pixel value [127,127,127] and using the modified code below gives a prediction result that is the same as the prediction result using the .h5 model on the PC.
2. Testing the converted .pb model on the PC using TensorFlow's gfile module with an image gives a correct prediction result (compared to the .h5 model), so I think the converted .pb file does not have any problem.

Entire section of preprocessBitmap:

// code of 'preprocessBitmap' section in TensorflowImageClassifier.java
TraceCompat.beginSection("preprocessBitmap");
// Preprocess the image data from 0-255 int to normalized float based
// on the provided parameters.
bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
for (int i = 0; i < intValues.length; ++i) {
    // this is ARGB format, so we need to mask the least significant 8 bits to get blue,
    // the next 8 bits to get green and the next 8 bits to get red.
    // Since we have an opaque image, alpha can be ignored.
    final int val = intValues[i];
    // original
    /*
    floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd;
    floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd;
    floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd;
    */
    // what I think it should be, to do the same thing as mode caffe in Keras
    floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - (float)123.68);
    floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - (float)116.779);
    floatValues[i * 3 + 2] = ((val & 0xFF) - (float)103.939);
}
TraceCompat.endSection();
This question is old, but remains the top Google result for preprocess_input for ResNet50 on Android. I could not find an answer for implementing preprocess_input for Java/Android, so I came up with the following, based on the original Python/Keras code:

/*
Preprocesses an RGB bitmap IAW keras/imagenet.
Port of https://github.com/tensorflow/tensorflow/blob/v2.3.1/tensorflow/python/keras/applications/imagenet_utils.py#L169
with data_format='channels_last', mode='caffe'.
Converts the image from RGB to BGR, then zero-centers each color channel
with respect to the ImageNet dataset, without scaling.
Returns a 3D float array.
*/
static float[][][] imagenet_preprocess_input_caffe(Bitmap bitmap) {
    // https://github.com/tensorflow/tensorflow/blob/v2.3.1/tensorflow/python/keras/applications/imagenet_utils.py#L210
    final float[] imagenet_means_caffe = new float[]{103.939f, 116.779f, 123.68f};
    float[][][] result = new float[bitmap.getHeight()][bitmap.getWidth()][3]; // assuming RGB
    for (int y = 0; y < bitmap.getHeight(); y++) {
        for (int x = 0; x < bitmap.getWidth(); x++) {
            final int px = bitmap.getPixel(x, y);
            // RGB -> BGR, then subtract the means; no scaling
            result[y][x][0] = (Color.blue(px) - imagenet_means_caffe[0]);
            result[y][x][1] = (Color.green(px) - imagenet_means_caffe[1]);
            result[y][x][2] = (Color.red(px) - imagenet_means_caffe[2]);
        }
    }
    return result;
}

Usage with a tensorflow-lite input of shape (1,224,224,3):

Bitmap bitmap = <your bitmap of size 224x224x3>;
float[][][][] imgValues = new float[1][bitmap.getHeight()][bitmap.getWidth()][3];
imgValues[0] = imagenet_preprocess_input_caffe(bitmap);
... <prep tfInput, tfOutput> ...
tfLite.run(tfInput, tfOutput);
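For reference, the Python behaviour this ports boils down to the following sketch (what imagenet_utils applies with mode='caffe' and channels_last):

import numpy as np

def preprocess_caffe(x):
    # x: float32 array of shape (h, w, 3) in RGB order
    x = x[..., ::-1]  # RGB -> BGR
    x[..., 0] -= 103.939  # zero-center blue
    x[..., 1] -= 116.779  # zero-center green
    x[..., 2] -= 123.68   # zero-center red
    return x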
Same Tensorflow model giving different results on Android and Python
I am trying to run a TensorFlow model in my Android application, but the same trained model gives different results (wrong inference) compared to when it is run in Python on the desktop. The model is a simple sequential CNN to recognize characters, much like this number plate recognition network, minus the windowing, as my model has the characters already cropped into place.

I have:

- the model saved in a protobuf (.pb) file - modeled and trained in Keras on Python/Linux + GPU
- the inference tested on a different computer on pure TensorFlow, to make sure Keras was not the culprit; here, the results were as expected
- TensorFlow 1.3.0 being used on both Python and Android, installed from pip on Python and jcenter on Android

The results on Android do not resemble the expected outcome. The input is a 129*45 RGB image, so a 129*45*3 array, and the output is a 4*36 array (representing 4 characters from 0-9 and a-z). I used this code to save the Keras model as a .pb file.

Python code; this works as expected:

test_image = [ndimage.imread("test_image.png", mode="RGB").astype(float)/255]
imTensor = np.asarray(test_image)

def load_graph(model_file):
    graph = tf.Graph()
    graph_def = tf.GraphDef()
    with open(model_file, "rb") as f:
        graph_def.ParseFromString(f.read())
    with graph.as_default():
        tf.import_graph_def(graph_def)
    return graph

graph = load_graph("model.pb")
with tf.Session(graph=graph) as sess:
    input_operation = graph.get_operation_by_name("import/conv2d_1_input")
    output_operation = graph.get_operation_by_name("import/output_node0")
    results = sess.run(output_operation.outputs[0],
                       {input_operation.outputs[0]: imTensor})

Android code, based on this example; this gives seemingly random results:

Bitmap bitmap;
try {
    InputStream stream = getAssets().open("test_image.png");
    bitmap = BitmapFactory.decodeStream(stream);
} catch (IOException e) {
    e.printStackTrace();
}
inferenceInterface = new TensorFlowInferenceInterface(context.getAssets(), "model.pb");
int[] intValues = new int[129*45];
float[] floatValues = new float[129*45*3];
String outputName = "output_node0";
String[] outputNodes = new String[]{outputName};
float[] outputs = new float[4*36];
bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
for (int i = 0; i < intValues.length; ++i) {
    final int val = intValues[i];
    floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255;
    floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255;
    floatValues[i * 3 + 2] = (val & 0xFF) / 255;
}
inferenceInterface.feed("conv2d_1_input", floatValues, 1, 45, 129, 3);
inferenceInterface.run(outputNodes, false);
inferenceInterface.fetch(outputName, outputs);

Any help is greatly appreciated!
One problem is in these lines:

floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255;
floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255;
floatValues[i * 3 + 2] = (val & 0xFF) / 255;

where the RGB values are divided by an integer, thus yielding an integer result (0 for every channel value below 255).

Moreover, the division, even if executed with 255.0 and yielding a float between 0 and 1.0, may pose a problem, as the values aren't distributed in the projection space (0..1) like they were in nature. To explain this: a value of 255 in the sensor domain (e.g. the R value) means that the natural value of the measured signal fell somewhere in the "255" bucket, which is a whole range of energies/intensities/etc. Mapping this value to 1.0 will most likely cut half of its range, as subsequent calculations could saturate at a maximum multiplicator of 1.0, which really is only the midpoint of a +-1/256 bucket. So maybe the transformation would more correctly be a mapping to the midpoints of a 256-bucket division of the 0..1 range:

((val & 0xff) / 256.0) + (0.5/256.0)

but this is just a guess on my part.
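In practice, the minimal fix for the snippet in the question is to force floating-point division, which also reproduces the /255 scaling used when the model was trained (a sketch of just the changed lines):

floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255.0f;
floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255.0f;
floatValues[i * 3 + 2] = (val & 0xFF) / 255.0f;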
A way to use blender animations with pygame to export it to android [duplicate]
I can't seem to find the answer to this question anywhere. I realize that you have to use PyOpenGL or something similar to do OpenGL stuff, but I was wondering whether it's possible to do very basic 3D graphics without any other dependencies.
No. Pygame is a wrapper for SDL, which is a 2D API. Pygame doesn't provide any 3D capability and probably never will. 3D libraries for Python include Panda3D and DirectPython, although they are probably quite complex to use, especially the latter.
Well, if you can do 2D, you can always do 3D. All 3D really is is skewed two-dimensional surfaces giving the impression that you're looking at something with depth. The real question is whether it can do it well, and whether you would even want to. After browsing the pygame documentation for a while, it looks like it's just an SDL wrapper. SDL is not intended for 3D programming, so the answer to the real question is: no, and I wouldn't even try.
You can do pseudo-3D games (like "Doom") with pygame only: http://code.google.com/p/gh0stenstein/ and if you browse the pygame.org site you may find more "3D" games done with Python and pygame. However, if you really want to go into 3D programming, you should look into OpenGL, Blender or any other real 3D library.
Pygame was never originally meant to do 3D, but there is a way you can do 3D with any 2D graphics library. All you need is the following function, which converts 3D points to 2D points, allowing you to make any 3D shape by just drawing lines on the screen:

def convert_to_2d(point=[0, 0, 0]):
    return [point[0]*(point[2]*.3), point[1]*(point[2]*.3)]

This is called pseudo-3D, or 2.5D. It can be done, but may be slow, and it is extremely difficult to do, so it is suggested that you use a library meant for 3D.
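For comparison, the more conventional projection divides by depth instead of multiplying by it; a minimal sketch (assuming a viewing distance d and points in front of the camera):

def perspective_project(point, d=200):
    # classic perspective divide: points farther away (larger z) shrink toward the origin
    x, y, z = point
    scale = d / (d + z)
    return [x * scale, y * scale]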
Pygame itself does not support 3D, but by combining it with PyOpenGL you can make use of the power of both. Here is a full example:

import pygame
from pygame.locals import *
from OpenGL.GL import *
from OpenGL.GLU import *
import random

vertices = ((1, -1, -1), (1, 1, -1), (-1, 1, -1), (-1, -1, -1), (1, -1, 1), (1, 1, 1), (-1, -1, 1), (-1, 1, 1))
edges = ((0,1), (0,3), (0,4), (2,1), (2,3), (2,7), (6,3), (6,4), (6,7), (5,1), (5,4), (5,7))
surfaces = ((0,1,2,3), (3,2,7,6), (6,7,5,4), (4,5,1,0), (1,5,7,2), (4,0,3,6))
colors = ((1,0,0), (0,1,0), (0,0,1), (0,1,0), (1,1,1), (0,1,1), (1,0,0), (0,1,0), (0,0,1), (1,0,0), (1,1,1), (0,1,1))

def set_vertices(max_distance, min_distance=-20):
    x_value_change = random.randrange(-10, 10)
    y_value_change = random.randrange(-10, 10)
    z_value_change = random.randrange(-1*max_distance, min_distance)
    new_vertices = []
    for vert in vertices:
        new_vert = []
        new_x = vert[0] + x_value_change
        new_y = vert[1] + y_value_change
        new_z = vert[2] + z_value_change
        new_vert.append(new_x)
        new_vert.append(new_y)
        new_vert.append(new_z)
        new_vertices.append(new_vert)
    return new_vertices

def Cube(vertices):
    glBegin(GL_QUADS)
    for surface in surfaces:
        x = 0
        for vertex in surface:
            x += 1
            glColor3fv(colors[x])
            glVertex3fv(vertices[vertex])
    glEnd()
    glBegin(GL_LINES)
    for edge in edges:
        for vertex in edge:
            glVertex3fv(vertices[vertex])
    glEnd()

def main():
    pygame.init()
    display = (800, 600)
    pygame.display.set_mode(display, DOUBLEBUF|OPENGL)
    max_distance = 100
    gluPerspective(45, (display[0]/display[1]), 0.1, max_distance)
    glTranslatef(random.randrange(-5,5), random.randrange(-5,5), -40)
    #object_passed = False
    x_move = 0
    y_move = 0
    cube_dict = {}
    for x in range(50):
        cube_dict[x] = set_vertices(max_distance)
    #glRotatef(25, 2, 1, 0)
    x = glGetDoublev(GL_MODELVIEW_MATRIX)
    camera_x = x[3][0]
    camera_y = x[3][1]
    camera_z = x[3][2]
    button_down = False
    while True:
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                pygame.quit()
                quit()
            if event.type == pygame.MOUSEMOTION:
                if button_down == True:
                    print(pygame.mouse.get_pressed())
                    glRotatef(event.rel[1], 1, 0, 0)
                    glRotatef(event.rel[0], 0, 1, 0)
        for event in pygame.mouse.get_pressed():
            # print(pygame.mouse.get_pressed())
            if pygame.mouse.get_pressed()[0] == 1:
                button_down = True
            elif pygame.mouse.get_pressed()[0] == 0:
                button_down = False
        glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT)
        for each_cube in cube_dict:
            Cube(cube_dict[each_cube])
        pygame.display.flip()
        pygame.time.wait(10)

main()
pygame.quit()
quit()
What you see as 3D is actually a 2D game. After all, you are watching your screen, which (normally ;) ) is 2D. The virtual world (which is in 3D) is projected onto a plane, which is then shown on your screen. Our brains then convert that 2D image into a 3D one (like they do with the image from our eyes), making it look like it's 3D. So it's relatively easy to make a 3D game: you just create a virtual world using a multidimensional matrix and then project it each loop onto a 2D plane, which you display on your screen. One tutorial that can put you on your way to 3D programs (using pygame) is this one.
3D rendering in Pygame without the help of other dependencies is hard to achieve and will not perform well. Pygame does not offer any functionality for drawing 3D shapes, meshes, or even perspective and lighting. If you want to draw a 3D scene with Pygame, you need to compute the vertices using vector arithmetic and stitch the geometry together with polygons, as in the answer to Pygame rotating cubes around axis. This approach won't give satisfying performance and is only valuable for studying: 3D scenes are generated with the help of the GPU, and a CPU-only approach does not achieve the required performance. Nevertheless, nice results can be achieved with a 2.5D approach. See the answer to How do I fix wall warping in my raycaster?:

import pygame
import math

pygame.init()
tile_size, map_size = 50, 8
board = [
    '########',
    '#  #   #',
    '#  #  ##',
    '#  ##  #',
    '#      #',
    '###  ###',
    '#      #',
    '########']

def cast_rays(sx, sy, angle):
    rx = math.cos(angle)
    ry = math.sin(angle)
    map_x = sx // tile_size
    map_y = sy // tile_size
    t_max_x = sx/tile_size - map_x
    if rx > 0:
        t_max_x = 1 - t_max_x
    t_max_y = sy/tile_size - map_y
    if ry > 0:
        t_max_y = 1 - t_max_y

    while True:
        if ry == 0 or t_max_x < t_max_y * abs(rx / ry):
            side = 'x'
            map_x += 1 if rx > 0 else -1
            t_max_x += 1
            if map_x < 0 or map_x >= map_size:
                break
        else:
            side = 'y'
            map_y += 1 if ry > 0 else -1
            t_max_y += 1
            if map_x < 0 or map_y >= map_size:
                break
        if board[int(map_y)][int(map_x)] == "#":
            break

    if side == 'x':
        x = (map_x + (1 if rx < 0 else 0)) * tile_size
        y = player_y + (x - player_x) * ry / rx
        direction = 'r' if x >= player_x else 'l'
    else:
        y = (map_y + (1 if ry < 0 else 0)) * tile_size
        x = player_x + (y - player_y) * rx / ry
        direction = 'd' if y >= player_y else 'u'
    return (x, y), math.hypot(x - sx, y - sy), direction

def cast_fov(sx, sy, angle, fov, no_ofrays):
    max_d = math.tan(math.radians(fov/2))
    step = max_d * 2 / no_ofrays
    rays = []
    for i in range(no_ofrays):
        d = -max_d + (i + 0.5) * step
        ray_angle = math.atan2(d, 1)
        pos, dist, direction = cast_rays(sx, sy, angle + ray_angle)
        rays.append((pos, dist, dist * math.cos(ray_angle), direction))
    return rays

area_width = tile_size * map_size
window = pygame.display.set_mode((area_width*2, area_width))
clock = pygame.time.Clock()

board_surf = pygame.Surface((area_width, area_width))
for row in range(8):
    for col in range(8):
        color = (192, 192, 192) if board[row][col] == '#' else (96, 96, 96)
        pygame.draw.rect(board_surf, color, (col * tile_size, row * tile_size, tile_size - 2, tile_size - 2))

player_x, player_y = round(tile_size * 4.5) + 0.5, round(tile_size * 4.5) + 0.5
player_angle = 0
max_speed = 3
colors = {'r': (196, 128, 64), 'l': (128, 128, 64), 'd': (128, 196, 64), 'u': (64, 196, 64)}

run = True
while run:
    clock.tick(30)
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            run = False

    keys = pygame.key.get_pressed()
    hit_pos_front, dist_front, side_front = cast_rays(player_x, player_y, player_angle)
    hit_pos_back, dist_back, side_back = cast_rays(player_x, player_y, player_angle + math.pi)
    player_angle += (keys[pygame.K_RIGHT] - keys[pygame.K_LEFT]) * 0.1
    speed = ((0 if dist_front <= max_speed else keys[pygame.K_UP]) -
             (0 if dist_back <= max_speed else keys[pygame.K_DOWN])) * max_speed
    player_x += math.cos(player_angle) * speed
    player_y += math.sin(player_angle) * speed

    rays = cast_fov(player_x, player_y, player_angle, 60, 40)

    window.blit(board_surf, (0, 0))
    for ray in rays:
        pygame.draw.line(window, (0, 255, 0), (player_x, player_y), ray[0])
    pygame.draw.line(window, (255, 0, 0), (player_x, player_y), hit_pos_front)
    pygame.draw.circle(window, (255, 0, 0), (player_x, player_y), 8)
    pygame.draw.rect(window, (128, 128, 255), (400, 0, 400, 200))
    pygame.draw.rect(window, (128, 128, 128), (400, 200, 400, 200))
    for i, ray in enumerate(rays):
        height = round(10000 / ray[2])
        width = area_width // len(rays)
        color = pygame.Color((0, 0, 0)).lerp(colors[ray[3]], min(height/256, 1))
        rect = pygame.Rect(area_width + i*width, area_width//2-height//2, width, height)
        pygame.draw.rect(window, color, rect)
    pygame.display.flip()

pygame.quit()
exit()

Also see PyGameExamplesAndAnswers - Raycasting.

I am aware that you asked "... but I was wondering if its possible to do very basic 3D graphics without any other dependencies." Anyway, I will give you some additional options with other dependencies. One way to make 3D scenes more powerful in Python is to use an OpenGL-based library like pyglet or ModernGL. However, you can use a Pygame window to create an OpenGL context. You need to set the pygame.OPENGL flag when creating the display Surface (see pygame.display.set_mode):

window = pg.display.set_mode((width, height), pygame.OPENGL | pygame.DOUBLEBUF)

Modern OpenGL PyGame/PyOpenGL example:

import pygame
from OpenGL.GL import *
from OpenGL.GLU import *
from OpenGL.GL.shaders import *
import ctypes
import glm

glsl_vert = """
#version 330 core
layout (location = 0) in vec3 a_pos;
layout (location = 1) in vec4 a_col;
out vec4 v_color;
uniform mat4 u_proj;
uniform mat4 u_view;
uniform mat4 u_model;
void main()
{
    v_color = a_col;
    gl_Position = u_proj * u_view * u_model * vec4(a_pos.xyz, 1.0);
}
"""

glsl_frag = """
#version 330 core
out vec4 frag_color;
in vec4 v_color;
void main()
{
    frag_color = v_color;
}
"""

class Cube:
    def __init__(self):
        v = [(-1,-1,-1), ( 1,-1,-1), ( 1, 1,-1), (-1, 1,-1), (-1,-1, 1), ( 1,-1, 1), ( 1, 1, 1), (-1, 1, 1)]
        edges = [(0,1), (1,2), (2,3), (3,0), (4,5), (5,6), (6,7), (7,4), (0,4), (1,5), (2,6), (3,7)]
        surfaces = [(0,1,2,3), (5,4,7,6), (4,0,3,7), (1,5,6,2), (4,5,1,0), (3,2,6,7)]
        colors = [(1,0,0), (0,1,0), (0,0,1), (1,1,0), (1,0,1), (1,0.5,0)]
        line_color = [0, 0, 0]

        edge_attributes = []
        for e in edges:
            edge_attributes += v[e[0]]
            edge_attributes += line_color
            edge_attributes += v[e[1]]
            edge_attributes += line_color
        face_attributes = []
        for i, quad in enumerate(surfaces):
            for iv in quad:
                face_attributes += v[iv]
                face_attributes += colors[i]

        self.edge_vbo = glGenBuffers(1)
        glBindBuffer(GL_ARRAY_BUFFER, self.edge_vbo)
        glBufferData(GL_ARRAY_BUFFER, (GLfloat * len(edge_attributes))(*edge_attributes), GL_STATIC_DRAW)
        self.edge_vao = glGenVertexArrays(1)
        glBindVertexArray(self.edge_vao)
        glVertexAttribPointer(0, 3, GL_FLOAT, False, 6*ctypes.sizeof(GLfloat), ctypes.c_void_p(0))
        glEnableVertexAttribArray(0)
        glVertexAttribPointer(1, 3, GL_FLOAT, False, 6*ctypes.sizeof(GLfloat), ctypes.c_void_p(3*ctypes.sizeof(GLfloat)))
        glEnableVertexAttribArray(1)

        self.face_vbos = glGenBuffers(1)
        glBindBuffer(GL_ARRAY_BUFFER, self.face_vbos)
        glBufferData(GL_ARRAY_BUFFER, (GLfloat * len(face_attributes))(*face_attributes), GL_STATIC_DRAW)
        self.face_vao = glGenVertexArrays(1)
        glBindVertexArray(self.face_vao)
        glVertexAttribPointer(0, 3, GL_FLOAT, False, 6*ctypes.sizeof(GLfloat), ctypes.c_void_p(0))
        glEnableVertexAttribArray(0)
        glVertexAttribPointer(1, 3, GL_FLOAT, False, 6*ctypes.sizeof(GLfloat), ctypes.c_void_p(3*ctypes.sizeof(GLfloat)))
        glEnableVertexAttribArray(1)

    def draw(self):
        glEnable(GL_DEPTH_TEST)
        glLineWidth(5)
        glBindVertexArray(self.edge_vao)
        glDrawArrays(GL_LINES, 0, 12*2)
        glBindVertexArray(0)
        glEnable(GL_POLYGON_OFFSET_FILL)
        glPolygonOffset(1.0, 1.0)
        glBindVertexArray(self.face_vao)
        glDrawArrays(GL_QUADS, 0, 6*4)
        glBindVertexArray(0)
        glDisable(GL_POLYGON_OFFSET_FILL)

def set_projection(w, h):
    return glm.perspective(glm.radians(45), w / h, 0.1, 50.0)

pygame.init()
window = pygame.display.set_mode((400, 300), pygame.DOUBLEBUF | pygame.OPENGL | pygame.RESIZABLE)
clock = pygame.time.Clock()

proj = set_projection(*window.get_size())
view = glm.lookAt(glm.vec3(0, 0, 5), glm.vec3(0, 0, 0), glm.vec3(0, 1, 0))
model = glm.mat4(1)
cube = Cube()
angle_x, angle_y = 0, 0

program = compileProgram(
    compileShader(glsl_vert, GL_VERTEX_SHADER),
    compileShader(glsl_frag, GL_FRAGMENT_SHADER))
attrib = { a : glGetAttribLocation(program, a) for a in ['a_pos', 'a_col'] }
print(attrib)
uniform = { u : glGetUniformLocation(program, u) for u in ['u_model', 'u_view', 'u_proj'] }
print(uniform)
glUseProgram(program)

run = True
while run:
    clock.tick(60)
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            run = False
        elif event.type == pygame.VIDEORESIZE:
            glViewport(0, 0, event.w, event.h)
            proj = set_projection(event.w, event.h)

    model = glm.mat4(1)
    model = glm.rotate(model, glm.radians(angle_y), glm.vec3(0, 1, 0))
    model = glm.rotate(model, glm.radians(angle_x), glm.vec3(1, 0, 0))
    glUniformMatrix4fv(uniform['u_proj'], 1, GL_FALSE, glm.value_ptr(proj))
    glUniformMatrix4fv(uniform['u_view'], 1, GL_FALSE, glm.value_ptr(view))
    glUniformMatrix4fv(uniform['u_model'], 1, GL_FALSE, glm.value_ptr(model))
    angle_x += 1
    angle_y += 0.4

    glClearColor(0.5, 0.5, 0.5, 1)
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)
    cube.draw()
    pygame.display.flip()

pygame.quit()
exit()

Legacy OpenGL PyGame/PyOpenGL example:

import pygame
from OpenGL.GL import *
from OpenGL.GLU import *

class Cube:
    def __init__(self):
        self.v = [(-1,-1,-1), ( 1,-1,-1), ( 1, 1,-1), (-1, 1,-1), (-1,-1, 1), ( 1,-1, 1), ( 1, 1, 1), (-1, 1, 1)]
        self.edges = [(0,1), (1,2), (2,3), (3,0), (4,5), (5,6), (6,7), (7,4), (0,4), (1,5), (2,6), (3,7)]
        self.surfaces = [(0,1,2,3), (5,4,7,6), (4,0,3,7), (1,5,6,2), (4,5,1,0), (3,2,6,7)]
        self.colors = [(1,0,0), (0,1,0), (0,0,1), (1,1,0), (1,0,1), (1,0.5,0)]

    def draw(self):
        glEnable(GL_DEPTH_TEST)
        glLineWidth(5)
        glColor3fv((0, 0, 0))
        glBegin(GL_LINES)
        for e in self.edges:
            glVertex3fv(self.v[e[0]])
            glVertex3fv(self.v[e[1]])
        glEnd()
        glEnable(GL_POLYGON_OFFSET_FILL)
        glPolygonOffset(1.0, 1.0)
        glBegin(GL_QUADS)
        for i, quad in enumerate(self.surfaces):
            glColor3fv(self.colors[i])
            for iv in quad:
                glVertex3fv(self.v[iv])
        glEnd()
        glDisable(GL_POLYGON_OFFSET_FILL)

def set_projection(w, h):
    glMatrixMode(GL_PROJECTION)
    glLoadIdentity()
    gluPerspective(45, w / h, 0.1, 50.0)
    glMatrixMode(GL_MODELVIEW)

def screenshot(display_surface, filename):
    size = display_surface.get_size()
    buffer = glReadPixels(0, 0, *size, GL_RGBA, GL_UNSIGNED_BYTE)
    screen_surf = pygame.image.fromstring(buffer, size, "RGBA")
    pygame.image.save(screen_surf, filename)

pygame.init()
window = pygame.display.set_mode((400, 300), pygame.DOUBLEBUF | pygame.OPENGL | pygame.RESIZABLE)
clock = pygame.time.Clock()
set_projection(*window.get_size())

cube = Cube()
angle_x, angle_y = 0, 0
run = True
while run:
    clock.tick(60)
    take_screenshot = False
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            run = False
        elif event.type == pygame.VIDEORESIZE:
            glViewport(0, 0, event.w, event.h)
            set_projection(event.w, event.h)
        elif event.type == pygame.KEYDOWN:
            take_screenshot = True

    glLoadIdentity()
    glTranslatef(0, 0, -5)
    glRotatef(angle_y, 0, 1, 0)
    glRotatef(angle_x, 1, 0, 0)
    angle_x += 1
    angle_y += 0.4

    glClearColor(0.5, 0.5, 0.5, 1)
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)
    cube.draw()
    if take_screenshot:
        screenshot(window, "cube.png")
    pygame.display.flip()

pygame.quit()
exit()
You can do it like this:

def convert_2d(x, y, z, horizon):
    d = 1 - (z/horizon)
    return x*d, y*d

def draw_list_of_points(lst):
    '''Assume that lst is a list of 3-dimensional points like
    [(0, 0, 0), (1, 6, 2), ...]. Taking 200 for the horizon
    gives us a pretty clean 3D.'''
    for x, y, z in lst:
        pygame.draw.circle(screen, color, convert_2d(x, y, z, 200), 1)

But it's not very fast. If you want fast, try to implement it in C++/SDL2 or C. Pygame is not very good for 3D graphics.
It is easy to make a 3D driver for PyGame; PyGame has some assets for 3D game development. I am developing a Py3D driver using PyGame now; when I finish, I'll show you a link to download Py3D. I tried to make a 3D game with PyGame, and I needed just a small addon for PyGame. It is wrong to think you must use SDL, PyOpenGL, OpenGL, PyQt5 or Tkinter; all of them are the wrong choice for making 3D games. OpenGL, PyOpenGL and Panda3D are very hard to learn, and all my games made on those drivers were awful. PyQt5 and Tkinter aren't drivers for making games, although they have addons for it; don't try to make any game on those drivers. All drivers where we need to use the math module are hard. You can easily make a small addon for them; I think anybody can make a driver for PyGame in 1-2 weeks.
If you want to stick with a Python-esque language when making games, Godot is a good alternative, with both 2D and 3D support, a large community, and lots of tutorials. Its custom scripting language (GDScript) has some minor differences, but overall it's mostly the same. It also supports C# and C++, and has many more features when it comes to game development.
Pygame is just a library for changing the color of pixels (and some other useful stuff for programming games). You can do this by blitting images to the screen or directly setting the colors of pixels. Because of this, it is easy to write 2D games with pygame, as the above is all you really need. But a 3D game is just some 3D objects 'squashed' (rendered) into 2D so that they can be displayed on the screen. So, to make a 3D game using only pygame, you would have to handle this rendering yourself, including all the complex matrix maths necessary. Not only would this run slowly because of the immense processing power involved, it would also require you to write a massive 3D rendering/rasterisation engine; and because Python is interpreted, it would be even slower. The correct approach is to have this process run on the GPU using (Py)OpenGL. So, yes, it is technically possible to do 3D using only pygame, but definitely not recommended. I would suggest you learn Panda3D or some similar 3D engine.
Simple: just draw a bunch of polygons, where the back face is the front face shifted by a fixed offset:

import pygame

screen = pygame.display.set_mode((100, 100))

while True:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            pygame.quit()
            raise SystemExit

    screen.fill((0, 0, 0))
    # first side (the front) in red, outlined in white
    pos = [(10, 10), (20, 10), (20, 20), (10, 20)]
    pygame.draw.polygon(screen, (225, 0, 0), pos)
    pygame.draw.lines(screen, (225, 225, 225), True, pos)
    # second side (the back) in blue: the front face shifted by (2.5, 2.5)
    pos2 = [(x + 2.5, y + 2.5) for x, y in pos]
    pygame.draw.polygon(screen, (0, 0, 225), pos2)
    pygame.draw.lines(screen, (225, 225, 225), True, pos2)
    # third side (the left) in green
    pos3 = [pos[0], pos2[0], pos2[3], pos[3]]
    pygame.draw.polygon(screen, (0, 225, 0), pos3)
    pygame.draw.lines(screen, (225, 225, 225), True, pos3)
    # fourth side (the right) in purple
    pos4 = [pos[1], pos2[1], pos2[2], pos[2]]
    pygame.draw.polygon(screen, (225, 0, 225), pos4)
    pygame.draw.lines(screen, (225, 225, 225), True, pos4)
    pygame.display.flip()

There is a simple cube, and I will soon provide a link to the full code to rotate and resize it. This should give you an equivalent of what you would get by using OpenGL.
This is what I have managed to do with just Pygame and NumPy, without using OpenGL, with some basic shading. You can do 3D in Pygame, but it probably isn't the most efficient or fastest approach.